A Multiple-Target Tracking Algorithm has been designed and
tested for use with DIFAR sonobuoys. The algorithm is implemented as a
functionally modular computer program composed of five main subroutines.
Operating in near real time and in less than 64K words of memory, the
algorithm is capable of separating raw measurements into data sets
corresponding to individual targets at the sensor level (using cluster
analysis), correlating target measurements across sensors (using physical
constraint and statistical tests), and selecting the most likely track
scenario among the various potential scenarios (using integer
programming). The process requires no operator decision and no a priori
information about the number or initial conditions of the targets. The
algorithm is self-initializing from the raw buoy data. Using synthetic
data, the algorithm was tested on several multi-target scenarios with
excellent results.


TABLE OF CONTENTS


Section

        EXECUTIVE SUMMARY

1.0     INTRODUCTION
1.1     Summary of Past Work
1.2     Results from Current Investigation
1.2.1   Selection of Clustering Algorithm
1.2.2   Decision Rules for Automatic Cluster Extraction
1.2.3   Initial Guess Procedure
1.2.4   Physical and Statistical Constraints on the Initial
        Guess Estimates
1.2.5   Integer Programming and Optimal Track Scenario
1.2.6   Scenario Simulation Results
1.3     Report Organization

2.0     DATA SORTING BY CLUSTER ANALYSIS
2.1     Description of Multi-Target Scenario Data
2.2     Moody and Jardine's Overlapping Single Linkage
        Clustering Algorithm
2.3     Selection of Ling's (1,r) Clustering Algorithm
2.4     Data Normalization

3.0     AUTOMATIC CLUSTER EXTRACTION
3.1     Single Linkage Clustering and Cluster Formations
3.2     The Isolation Index
3.3     Isolation Index Distribution - Survival Function
3.4     Polarization Test and Cluster Extraction
3.5     Internal Consistency and Cluster Extraction
3.5.1   Regression and Internal Consistency
3.5.2   Weighting Scheme
3.6     Automatic Cluster Extraction Summary

4.0     INITIAL GUESS PROCEDURE
4.1     Crossed-Bearing, Crossed-Frequency Initial Guess
        Procedure
4.2     Measurement Weighting Equations

5.0     PHYSICAL AND STATISTICAL CONSTRAINTS ON THE INITIAL
        GUESS ESTIMATES
5.1     Physical Constraints Test
5.2     Two Sensor Statistical Compatibility Test
5.3     Three Sensor Statistical Compatibility Test

6.0     INTEGER PROGRAMMING AND SCENARIO SELECTION
6.1     Set Partitioning Formulation
6.2     Applying Set Partitioning to the Scenario Selection -
        Constraints
6.3     Applying Set Partitioning to Scenario Selection -
        Cost Function
6.4     Integer Programming Summary

7.0     RESULTS
7.1     Introduction
7.2     Scenario Data Generation
7.2.1   Sensor Locations
7.2.2   Target Trajectories
7.2.3   Simulated DIFAR Multi-Target Data
7.3     Three Target Scenario, Good Geometry, Strong Signal
7.3.1   Introduction
7.3.2   Data Clustering Results - Buoys 1 and 3
7.3.3   Cluster Extraction Results - Buoy 3 and Buoy 1
7.3.4   Initial Guess Results
7.3.5   Integer Programming Results
7.3.6   Conclusion
7.4     Three Target Scenario with Bad Geometry, Low Noise,
        0 dB Threshold in 1 Hz Band
7.5     Three Target Scenario with Good Geometry, Moderate
        Noise, No Threshold
7.6     Two Target Scenario with Good Geometry, Low Noise,
        and 0 dB in a 1 Hz Band Threshold

8.0     SUMMARY AND RECOMMENDATIONS
8.1     Project Summary
8.2     Recommendations for Future Work

9.0     REFERENCES


LIST OF FIGURES


Figure

1.1     Top-Down Flowchart of Tracor's Multi-Target Tracking
        Algorithm (MTTA)
2.1     Example of Two Overlapping Clusters
3.1     Clustering Example
3.2     Bearing/Frequency Plot Examples
3.3     Plots to Determine Polarization Test Cut Off Point
3.4     Flowchart of the Internal Consistency Test
5.1     Plot of the Line-of-Sight Bearing Intersections
        Resulting from the Crossed-Bearings Procedure for
        Scenario 1
5.2     Initial Guess Procedure's 3-Sensor Combination Tests
7.1     Plot of the MTTA's Estimated Tracks vs. the True Tracks
        for the 3-Target, Strong Signal and Good Geometry
        Scenario
7.2     Integer Programming Equation for Scenario 1
7.3     Plot of the MTTA's Estimated Tracks vs. the True Tracks
        for the 3-Target, Strong Signal and Poor Geometry
        Scenario
7.4     Plot of the MTTA's Estimated Tracks vs. the True Tracks
        for the 3-Target, Weak Signal and Good Geometry
        Scenario
7.5     Plot of the MTTA's Estimated Tracks vs. the True Tracks
        for the 2-Target, Strong Signal and Good Geometry
        Scenario
8.1     Logical Flowchart of Tracor's Proposed Final MTTA


LIST OF TABLES


Table

7.I     DESCRIPTION OF THE 3 TARGET TRACKS FOR SCENARIO 1
7.II    DESCRIPTION OF THE 3 TARGET TRACKS FOR SCENARIO 2
7.III   DESCRIPTION OF THE 2 TARGET TRACK FOR SCENARIO 4
7.IV    SIMULATED MULTI-TARGET DIFAR DATA FOR BUOY 1 OF
        SCENARIO 1
7.V     SIMULATED MULTI-TARGET DIFAR DATA FOR BUOY 3 OF
        SCENARIO 1
7.VI    CLUSTERING OUTPUT FOR BUOY 1 OF SCENARIO 1
7.VII   CLUSTERING OUTPUT FOR BUOY 3 OF SCENARIO 1
7.VIII  SURVIVAL FUNCTION AND ISOLATION INDEX VALUES FOR
        CLUSTERS FROM BUOY 1 OF SCENARIO 1
7.IX    SURVIVAL FUNCTION AND ISOLATION INDEX VALUES FOR
        CLUSTERS FROM BUOY 3 OF SCENARIO 1
7.X     BUOY, CLUSTER, AND TARGET NUMBER CORRESPONDENCE TABLE
7.XI    INITIAL GUESS AND CONSTRAINTS RESULTS FOR SCENARIO 1
7.XII   SIMULATED MULTI-TARGET DIFAR DATA FOR BUOY 1 OF
        SCENARIO 2
7.XIII  CLUSTERING OUTPUT FOR BUOY 1 OF SCENARIO 1
7.XIV   SIMULATED MULTI-TARGET DIFAR DATA FOR BUOY 3 OF
        SCENARIO 2
7.XV    CLUSTERING OUTPUT FOR BUOY 3 OF SCENARIO 2


EXECUTIVE  SUMMARY 


During the past year Tracor created and tested a
Multiple-Target Tracking Algorithm (MTTA) designed for use with
DIFAR sonobuoys. Operating in near real time and in less than
64K words of memory, the algorithm is capable of separating
raw measurements into data sets corresponding to individual
targets at the sensor level, correlating target measurements
across sensors, and selecting the most likely track scenario
among the various potential scenarios. The process requires no
operator decision and no a priori information about the number
or initial conditions of the targets. The algorithm is
self-initializing from the raw buoy data. Using synthetic data, the
algorithm was tested on several multi-target scenarios with
excellent results. This executive summary presents an overview
of work accomplished on the MTTA during the past year.
Included are a brief description of the algorithm's structure,
conclusions of the study, and recommendations for further work.

The  MTTA  is  implemented  as  a  functionally  modular 
computer  program  composed  of  five  main  subroutines.  At  present 
all  processing  is  serial;  however,  with  little  effort  the 
algorithm  could  be  implemented  in  a  parallel  processing  scheme. 

Sorting Data at Sensor Level - Cluster Analysis.
When plots of frequency versus bearing measurements from a
DIFAR sensor are constructed, the noise-free data lie along
either straight or curvilinear lines. For targets that are
moderately separated in bearing or frequency, the associated
lines will be quite distinct. Based on these observations,
experiments were conducted to test the usefulness of cluster
analytic techniques to separate target data at the sensor
level. They proved to be successful, and Ling's (1,r) algorithm
(equivalent to a single-linkage clustering algorithm) was used
to separate data at the sensor level into potential target data
sets.


Automatic  Cluster  Extraction.  While  cluster 
analysis  worked  well  in  separating  observations  into  proper 
target  data  sets,  the  structure  of  the  algorithm  made  it 
difficult  to  distinguish  these  sets  from  the  others  created  by 
the  algorithm.  This  is  a  standard  and  much  discussed  problem 
in  cluster  analysis.  Tracor  was  able  to  develop  a  three  stage 
attack  on  the  problem  which  has  proven  to  be  quite  successful. 
The  first  step  is  to  use  Ling's  isolation  index  to  select 
clusters  which  are  well  isolated  from  the  rest  of  the  data. 
Next,  the  clusters  are  checked  for  randomness  and  only  those 
that  are  distinctly  nonrandom  are  selected.  Lastly,  a 
qualitative  regression  procedure  is  used  to  check  clusters  with 
subsets  for  internal  consistency.  Only  those  clusters  which 
are isolated, nonrandom, and internally consistent are selected
as  target  data  sets. 

Intersensor Correlation - Initial Guess Procedure
and Constraint Rules. To correlate data from more than one
sensor, a crossed-bearing, crossed-frequency, batch-type
algorithm is used to provide initial state vector estimates
from the data contained in the various two-sensor and
three-sensor cluster combinations. These estimates are then
subjected to a series of physical and statistical tests
to determine their validity. The physical tests include checks
on reasonableness of position and velocity estimates. The
statistical tests use qualitative regression procedures to test
the two-sensor estimates for compatibility, and Gallant's
nonlinear regression test to compare a three-sensor estimate
and its associated two-sensor estimates for consistency.
Intersensor cluster combinations which pass all tests are then
considered as potential target tracks.


Scenario Construction - Integer Programming.
Once the potential target track set has been delineated, all
that remains is to construct the set of tracks which provides
the most likely scenario description. To solve this problem,
Tracor has used a 0-1 integer programming set partitioning
procedure to sift through the large number of potential
scenarios to select the most likely one. To devise a cost
function, each track is fed into Tracor's hybrid tracking
algorithm and the value of the likelihood function for the
associated measurement model residual stream is computed. The
likelihood function for a scenario is based on the likelihood
function associated with each track, and finding the optimal
scenario is equivalent to maximizing the likelihood function
over the set of all possible scenarios.
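
As a rough illustration of the selection step, the sketch below
enumerates every 0-1 choice of candidate tracks and keeps the cheapest
choice that uses each sensor-level cluster exactly once. The track
names, cluster labels, and cost values are hypothetical; the cost is
assumed to be a negative log-likelihood supplied by a single-target
tracker; and exhaustive enumeration stands in for a real integer
programming solver, so it is only practical for a handful of tracks.

```python
from itertools import product


def best_scenario(tracks, clusters):
    """Pick the cheapest set of tracks that covers every cluster exactly once.

    tracks   -- list of (name, frozenset_of_cluster_labels, cost) tuples
    clusters -- iterable of all cluster labels that must be covered
    Returns (list_of_track_names, total_cost), or None if no exact cover exists.
    """
    required = set(clusters)
    best = None
    # Each tuple in the product is one 0-1 decision vector over the tracks.
    for choice in product((0, 1), repeat=len(tracks)):
        chosen = [t for t, x in zip(tracks, choice) if x]
        covered = [c for _, members, _ in chosen for c in members]
        # Set partitioning constraint: no cluster reused, none left out.
        if len(covered) != len(set(covered)) or set(covered) != required:
            continue
        total = sum(cost for _, _, cost in chosen)
        if best is None or total < best[1]:
            best = ([name for name, _, _ in chosen], total)
    return best


# Hypothetical example: two clusters on each of two buoys, four candidate tracks.
CLUSTERS = ["B1C1", "B1C2", "B3C1", "B3C2"]
TRACKS = [
    ("T1", frozenset({"B1C1", "B3C1"}), 2.0),
    ("T2", frozenset({"B1C2", "B3C2"}), 3.0),
    ("T3", frozenset({"B1C1", "B3C2"}), 1.0),
    ("T4", frozenset({"B1C2", "B3C1"}), 9.0),
]
```

In this made-up example the single cheapest track (T3) is not part of
the best scenario, because choosing it forces the expensive T4. That is
the reason a joint optimization over whole scenarios is used rather
than picking tracks one at a time.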


The  MTTA  was  tested  on  four  scenarios  of  varying 
difficulty;  three  of  the  scenarios  contained  three  targets 
while  one  of  the  scenarios  contained  only  two  targets.  For 
three  of  the  scenarios,  all  targets  were  detected,  sorted,  and 
good  track  estimates  were  generated.  In  the  remaining 
scenario,  two  of  the  three  targets  were  detected  and  tracked, 
but  the  third  target  was  only  partially  detected  and  no  track 
was generated. The following conclusions were drawn from this
study:

1) Observation geometry was the single most
important variable for successful detection
and tracking of the targets by the MTTA,

2) Strong signal-to-noise ratio is important,
but not crucial to the successful operation
of the MTTA, and

3)  Execution  time  for  the  MTTA  varies  from 
scenario  to  scenario  based  on  the  complexity 
of  the  trajectories.  All  scenarios  were  of 
six  minute  duration,  and  the  MTTA  took  from 
seven  to  nine  minutes  to  run  through  them. 
Minimal  parallel  processing  could  turn  the 
MTTA  into  a  faster  than  real  time  batch 
processing  scheme. 

The recommendations for future work fall into two
categories: refinement of clustering and extraction
procedures, and MTTA extensions. Under the first category, the
following topics were identified as algorithm deficiencies that
need to be investigated:

1) Examination of alternative clustering
approaches (CLASSY, Anderberg's overlapping
algorithm).

2) Using quadratic terms in the cluster
extraction regression to model the closest
point of approach (CPA).

3) Addition of a clustering attribute based on
power or SNR to help sort targets.

In the second category, all multi-target schemes must deal with
the question of whether a measurement belongs to a currently
existing track or is part of a track that has yet to be
initialized. Tracor's proposed implementation of a complete
multi-target tracking algorithm would address this question as
soon  as  a  data  point  was  acquired.  If  it  was  determined  that 
the  data  point  was  not  part  of  a  current  track,  then  it  would 
be  put  into  a  data  pool  and  at  designated  times  the  MTTA  would 
be  activated  to  search  for  and  initialize  new  tracks.  The 
extension  to  the  MTTA  would  consist  of  the  decision  rules 
required  to  ask  whether  a  point  belongs  to  a  track  or  not  and 
then  the  procedures  required  to  update  a  track.  An  additional 
task  would  be  to  investigate  whether  smoothing  of  raw  data 
would  help  both  in  tracking  and  updating. 

In summary, Tracor has developed a
self-initializing multi-target tracking algorithm that has performed
well with simulated DIFAR data. It runs in nearly real time
and requires less than 64K words of memory. Although in this
study the MTTA used Tracor's Hybrid Tracking Algorithm, any
single-target tracking algorithm could be used to generate
scores for each potential track. What has been created is a
superstructure which sorts and makes decisions independently of
the particular target tracker used. Planned refinements and
extensions to the algorithm would give it the ability to run
for long periods of time, providing a real-time picture of the
current tracking scenario. It is felt that the MTTA represents
a promising step towards a fully automated, DIFAR-based,
multi-target tracking algorithm.


1.0  INTRODUCTION

For  many  years,  the  U.S.  Navy  has  been  concerned 
with  the  problems  of  detecting,  classifying,  and  tracking 
submerged  targets  encountered  in  anti-submarine  warfare  (ASW) 
engagements.  One  of  the  problems  which  has  been  extensively 
investigated  at  Tracor  is  underwater  target  tracking  with  data 
gathered  by  deployed,  passive  sonobuoy  detection  systems.  Past 
efforts  have  concentrated  on  the  problem  of  tracking  a  single 
target.  In  recent  years,  however,  greater  emphasis  has  been 
placed  on  developing  techniques  which  can  use  passive  data  to 
track  multiple  targets  simultaneously.  When  no  a  priori 
information  is  available  concerning  target  numbers  or 
trajectories,  the  multiple  target  tracking  problem  becomes  very 
difficult  to  solve.  This  report  contains  the  findings  from  the 
past year of Tracor's efforts to address the problem of
tracking multiple targets with passive data when no a priori
information is available. Under the current contract a
multi-target  tracking  algorithm  was  developed  which  performed 
very  well  on  simulated  multi-target  DIFAR  data.  The  algorithm 
was  able  to  properly  sort  data  at  the  sensor  level,  correlate 
data  across  sensors  and  reconstruct  the  tracks  under 
consideration.  It  is  felt  that  a  positive  step  has  been  taken 
in  creating  an  effective  DIFAR  based  multi-target  tracking 
algorithm. 


1.1  Summary  of  Past  Work 

In  the  past,  Tracor  has  studied  both  the  single 
and  the  multiple  target  tracking  problems.  Initial  emphasis 
was  placed  on  developing  a  quick  and  accurate  single  target 
tracking algorithm. This work led first to Tracor's
development of a batch-type tracker known as the Maximum
Likelihood Estimator (MLE, Reference 1). The MLE was designed
to initialize tracking solutions when no a priori information
was provided and to continue tracking a single target until all
the data had been processed. However, due to the batch nature
of the tracker, the MLE was forced to iterate through four
different motion models to account for possible target
maneuvers and to use sophisticated statistical techniques to
automatically choose the most appropriate motion model.
Unfortunately,  this  process  was  found  to  be  both  cumbersome  and 
time  consuming. 

In an effort to develop a quicker and more
efficient tracker, Tracor then developed an Extended Kalman
Filter (EKF, References 2 and 3) for single-target tracking
applications. Since the EKF was a sequential tracker, a
single, less sophisticated motion model was used with process
noise incorporated into the tracker to compensate for any
unmodeled trajectory changes. This single motion model
substantially reduced the complexity of the tracker and helped
to greatly speed up its execution time. Furthermore, if
reasonably good a priori information was available for the
target's state, the EKF was not only much quicker, but it was
at least as accurate as the MLE. Unfortunately, when no a
priori state information was available, the EKF frequently
failed to converge upon an adequate track of the target's
trajectory, and thus could not be counted on to track targets
under all conditions.

Since a quick but accurate tracker was desired
for single-target tracking applications, Tracor also developed
a tracker which utilized the best characteristics of both the
MLE and the EKF. This led to the development of Tracor's
Hybrid Tracking Algorithm (HTA, References 2 and 3). The HTA
utilized a batch-type starter to initialize the tracker and
then, after initialization had been successfully completed,
automatically switched to a sequential tracker to continue
updating the target's trajectory. This tracker was found to
provide the accuracy needed to initialize a trajectory when no
a priori information was available, and to track as quickly
and as accurately as the EKF after initialization was
completed. The HTA was then chosen as Tracor's final product
for single-target tracking applications.

After completing the development of the HTA as a
single-target tracker, Tracor became involved in the
development of a multiple target tracking algorithm (MTTA).
This algorithm was to be developed to handle the very difficult
problem of tracking multiple targets when no a priori
information was available concerning target numbers or
trajectories. Furthermore, only passive frequency and bearing
data (the data types commonly available from deployed
sonobuoys) were to be used to track the targets. In the past,
several authors have developed multiple target trackers which
used a priori information or active data types such as range
measurements to initialize the tracker, but very few have
examined the more common and much more difficult problem of
tracking multiple targets with only passive data and no a
priori information.

Last year (Reference 3), the development of the
MTTA was broken up into two phases. The first phase was
designed to enhance the HTA's performance so it could be
readily modified for further use in the MTTA. As envisioned,
the HTA would serve as the core tracking algorithm which would
be used to track the various targets present in a given
scenario.

Once the improvements to the HTA were completed,
an  attack  on  the  second,  more  difficult,  problem  of  data 
sorting  was  begun.  For  the  multiple  target  tracking  problem, 
some  means  was  needed  to  sort  the  data  received  from  individual 
sensors  into  sets  of  individual  target  data.  Without  any  a 
priori  information  concerning  the  targets  that  were  present, 
none  of  the  traditional  "gating"  or  "nearest  neighborhood" 
approaches  could  be  used  to  sort  the  data.  After  studying 
frequency  versus  bearing  time  histories  of  the  data,  it  was 
decided  to  try  to  use  cluster  analysis  to  sort  the  multiple 
target  data  into  single  target  data  sets.  After  several 
trials, single-linkage cluster analysis was found to be fairly
successful  in  sorting  the  data  when  the  time  tag,  estimated 
frequency,  and  the  cosine  and  sine  of  the  bearing  estimates 
were  used  as  object  attributes.  However,  the  results  of  the 
cluster  analysis  program  were  output  as  tree  diagrams  or 
dendrograms  which  required  user  interpretation  to  correctly 
sort  the  data.  For  the  cases  studied,  all  the  multiple  target 
data  were  generated  by  user  dictated  simulations,  so  it  was 
known  what  type  of  behavior  to  look  for  in  the  data.  With  this 
advantage,  it  was  possible  to  interpret  the  tree  diagram 
outputs  such  that  individual  target  data  sets  could  be  readily 
found.  Without  the  benefit  of  this  knowledge,  it  is  doubtful 
that  the  tree  diagrams  could  have  been  used  to  find  the  correct 
clusters  of  data.  From  these  results  it  was  decided  that 
cluster analysis showed good potential for sorting multiple
target  data  into  individual  target  data  sets,  but  a  suitable 
cluster  extraction  scheme  had  to  be  developed. 
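
The attribute choice described above can be sketched as follows. This
is a minimal illustration, not the report's code: the function names
are hypothetical, and the min-max scaling of the time tag and frequency
is an assumed normalization chosen only so that no attribute dominates
the Euclidean distance. Representing bearing by its cosine and sine
keeps bearings on either side of north (for example 359 and 1 degrees)
close together.

```python
import math


def build_features(measurements):
    """Turn (time_s, freq_hz, bearing_deg) tuples into 4-attribute objects.

    Attributes: scaled time tag, scaled frequency, cos(bearing), sin(bearing).
    The min-max scaling is an assumption made for this sketch.
    """
    def scale(values):
        lo, hi = min(values), max(values)
        span = hi - lo
        return [0.0 if span == 0 else (v - lo) / span for v in values]

    t_scaled = scale([m[0] for m in measurements])
    f_scaled = scale([m[1] for m in measurements])
    features = []
    for (_, _, bearing_deg), t, f in zip(measurements, t_scaled, f_scaled):
        b = math.radians(bearing_deg)
        features.append([t, f, math.cos(b), math.sin(b)])
    return features


def dissimilarity_matrix(features):
    """Symmetric matrix of Euclidean distances between attribute vectors."""
    n = len(features)
    return [[math.dist(features[i], features[j]) for j in range(n)]
            for i in range(n)]
```

The resulting matrix is the kind of resemblance matrix a single-linkage
routine would take as input.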

1.2  Results  from  Current  Investigation 


Tracor has continued its multiple target tracking
investigation through the current year. During this time,
Tracor's MTTA has evolved into a large-scale program containing
five separate modules. A top-down flowchart of these modules
and a brief description of each one are given in Figure 1.1.
The following sections summarize the work that has been
performed this year in each of the module areas.

1.2.1  Selection  of  Clustering  Algorithm 

During this year's investigation three different
clustering algorithms were examined: the single-linkage
clustering algorithm from CLUSTAR, Ling's generalized (k,r)
algorithm, and Moody and Jardine's single-linkage,
non-hierarchical overlapping algorithm. Euclidean
distances were used in all algorithms to generate the necessary
resemblance matrix of dissimilarity coefficients. The
overlapping algorithm allows different clusters to contain one,
two, or several points in common while the clusters maintain
their own separate identities. A non-overlapping algorithm, on
the other hand, requires that each data point that would create
an overlap be placed in one cluster or the other; otherwise the
two clusters are forced to merge into one cluster at these
points.

Figure 1.1.  Top-Down Flowchart of Tracor's
Multi-Target Tracking Algorithm (MTTA)

Initially, it had been felt that the B_k
overlapping algorithm might provide some flexibility that would
be useful for solving the data sorting problem. However, in
the cases studied, the non-overlapping techniques have been
found to be adequate for sorting the data. Furthermore, both
the non-overlapping algorithms were found to be substantially
faster than the overlapping algorithm. It was therefore
decided to remain with the single-linkage, non-overlapping
approach that had been used previously. Of the two remaining
algorithms that were under consideration, it was decided to use
Ling's (1,r) algorithm rather than his generalized (k,r)
algorithm or CLUSTAR's single-linkage algorithm because:

1) With k=1, Ling's (k,r) algorithm was
equivalent to a single-linkage algorithm and
produced the same results as CLUSTAR's
algorithm,

2) Its output was more accessible and easier to
use than the tree diagrams from CLUSTAR, and

3) Most importantly, Ling has developed a
quantitative measure for the (1,r) algorithm
that was found to be useful in implementing
a procedure for automatic cluster extraction.
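
The equivalence noted in item 1 can be illustrated with a short
sketch. It assumes that a (1,r)-cluster is a maximal group of points in
which every point can be reached from every other through a chain of
links no longer than r, which is the same grouping single linkage
produces when its tree is cut at level r. The function name and the
union-find bookkeeping are choices made for this illustration, not
details taken from Ling's or CLUSTAR's programs.

```python
def threshold_clusters(dist, r):
    """Group point indices into clusters linked by distances <= r.

    dist -- square symmetric matrix (list of lists) of dissimilarities
    r    -- linking threshold
    Returns a sorted list of clusters, each a sorted list of indices.
    """
    n = len(dist)
    parent = list(range(n))

    def find(i):
        # Follow parent pointers to the representative, compressing the path.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    # Join any two points whose dissimilarity does not exceed r.
    for i in range(n):
        for j in range(i + 1, n):
            if dist[i][j] <= r:
                parent[find(i)] = find(j)

    groups = {}
    for i in range(n):
        groups.setdefault(find(i), []).append(i)
    return sorted(groups.values())
```

Raising r step by step reproduces the chaining behaviour of single
linkage: separate groups stay apart until r reaches the shortest link
between them, after which they merge.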


1.2.2  Decision Rules for Automatic Cluster Extraction

One of the problems seen last year with the
single-linkage clustering algorithm was that it continued to
link or chain all the data and the clusters together until,
ultimately, all the data were chained together in one large,
all-inclusive cluster. Naturally, the sorted data for the
individual targets were usually found in smaller subclusters
contained as subsets of the large final cluster. In last
year's report, available knowledge concerning the simulated
scenarios was used to aid in picking the correct smaller
clusters from the tree diagram outputs. This year it was
sought to automate this cluster extraction process. In doing
so, three different decision rules have been established to
automatically extract the appropriate clusters. The three
rules or tests adopted are:

1) Ling's isolation index, which is used to identify
potentially useful clusters. It identifies
clusters which are well isolated from other
clusters, implying that the data in the
isolated cluster have little similarity with
the data in other clusters,

2) A polarization test for data randomness,
based on work done by Alam and Mitra
(Reference 4), which seeks to detect and
eliminate noisy clusters, i.e., those
clusters containing a high percentage of
points corresponding to background noise and
not signal measurements, and

3) A regression-based test, which is used to
make the final identifications of those
isolated clusters corresponding to likely
target data sets.
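
The first rule can be sketched in code under a stated assumption: the
isolation index of a cluster is taken here to be the gap between the
linking level at which the cluster forms and the level at which it is
absorbed into a larger cluster. This sketch uses raw distances for
those levels, which may differ from the scale Ling's definition uses,
and all function names are hypothetical. The polarization and
regression tests are not attempted.

```python
def single_linkage_lifetimes(dist):
    """List (members, birth, death) for each multi-point single-linkage cluster.

    birth is the level at which the cluster formed; death is the level at
    which it merged into a larger cluster (None for the final cluster).
    """
    n = len(dist)
    edges = sorted((dist[i][j], i, j)
                   for i in range(n) for j in range(i + 1, n))
    parent = list(range(n))
    members = {i: [i] for i in range(n)}
    birth = {i: 0.0 for i in range(n)}
    records = []

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for d, i, j in edges:
        ri, rj = find(i), find(j)
        if ri == rj:
            continue  # already in the same cluster at a lower level
        # Both merging clusters end their lives at level d.
        for root in (ri, rj):
            if len(members[root]) > 1:
                records.append((sorted(members[root]), birth[root], d))
        parent[ri] = rj
        members[rj] = members[rj] + members.pop(ri)
        birth[rj] = d
    root = find(0)
    records.append((sorted(members[root]), birth[root], None))
    return records


def isolation_index(birth, death):
    """Gap between formation and absorption levels (None if never absorbed)."""
    return None if death is None else death - birth
```

A cluster that forms early and survives over a wide range of levels
gets a large index, which matches the informal notion of being well
isolated from the rest of the data.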

1.2.3  Initial  Guess  Procedure 

After the multiple target data from individual
sensors have been sorted by cluster analysis into individual
target data sets, these data clusters must be properly matched
with clusters from other sensors to construct tracks for the
individual targets. With only passive frequency and bearing
measurements, it is usually impossible to track a target with
data from just one sensor. Typically, data from two or three
sensors with other observation geometries must be used to track
a given target. For the multi-target case, several clusters
from each of the sonobuoys will be available to try to track a
target. Ultimately, one must find the right combination of
clusters from the different sensors that is needed to track
each of the individual targets.

After the individual clusters
from each sensor have been established, the next step is to get
a rough idea of which combinations of clusters can potentially
be used to build tracks. For this purpose, a crossed-bearing,
crossed-frequency least squares procedure has been employed to
generate an estimate of the initial target state for each
possible combination of matched, intersensor clusters. This
procedure has been used previously (Reference 3) to generate a
starting point for the HTA's initializer for use in tracking
single targets. It has been found that with some slight
modifications, this same crossed-bearing, crossed-frequency
procedure may be used to provide a cursory look at how well the
intersensor combinations of data clusters go together to
provide estimates of potential initial conditions for the
targets. As will be described next, the rough estimates
produced here can be examined to see if they meet certain
physical and statistical constraints. For those estimates that
fail the constraints tests, the corresponding intersensor
cluster combinations can be eliminated from further
consideration. This then helps to reduce the scope of the
problem under consideration before it passes on to the next
MTTA module.
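
The bearing half of the procedure can be illustrated with a much
simplified sketch: a least-squares intersection of bearing lines for a
target assumed to be stationary. The full procedure also uses frequency
data and estimates velocity, neither of which is attempted here. The
function name, the flat x-y coordinate frame, and the convention that
bearings are measured clockwise from north (the +y axis) are all
assumptions of this example.

```python
import math


def crossed_bearing_fix(sensors, bearings_deg):
    """Least-squares intersection of bearing lines from several sensors.

    sensors      -- list of (x, y) sensor positions
    bearings_deg -- bearings to the target, clockwise from north (+y)
    Each bearing defines the line  cos(b)*px - sin(b)*py = cos(b)*sx - sin(b)*sy.
    Returns the (px, py) that best satisfies all lines in the least-squares sense.
    """
    a11 = a12 = a22 = b1 = b2 = 0.0
    for (sx, sy), brg in zip(sensors, bearings_deg):
        th = math.radians(brg)
        c, s = math.cos(th), math.sin(th)
        rhs = c * sx - s * sy
        # Accumulate the 2x2 normal equations for rows [c, -s].
        a11 += c * c
        a12 += -c * s
        a22 += s * s
        b1 += c * rhs
        b2 += -s * rhs
    det = a11 * a22 - a12 * a12
    if abs(det) < 1e-12:
        raise ValueError("bearing lines are parallel; no unique fix")
    px = (a22 * b1 - a12 * b2) / det
    py = (a11 * b2 - a12 * b1) / det
    return px, py
```

With two sensors the result is the ordinary crossing point of the two
lines of sight; with noisy bearings from more sensors it is a
compromise point. Nearly parallel lines give a poorly conditioned
solution, which is one way observation geometry affects the estimate.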

1.2.4  Physical  and  Statistical  Constraints  on  the 

Initial  Guess  Estimates 

As  stated,  the  initial  guess  estimates  must  meet 
certain  constraints  before  they  are  passed  for  further 
consideration  by  Tracor's  MTTA.  The  first  test  is  to  see  if 
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Che  position  estimate  falls  within  the  physical  limits  of  the 
sonobuoy's  observation  range  and  if  the  target's  velocity 
estimate  falls  within  the  range  of  allowable  values  for  target 
trajectories.  For  instance ,  if  the  range  for  the  day  of  a 
given  sonobuoy  is  set  at  10,000  meters,  one  could  safely  reject 
any  initial  guess  estimates  that  would  place  a  target  30,000 
meters  from  the  sonobuoy.  Similarly,  if  the  initial  guess  of 
target  velocity  falls  well  outside  the  known  range  of  values 
that  are  possible  for  target  trajectories,  one  could  safely 
eliminate  the  intersensor  cluster  combination  that  yielded  that 
estimate.  For  the  physical  constraints  routine  used  in 
Tracor's  MTTA,  the  user  inputs  the  maximum  sonobuoy  detection 
range  and  the  maximum  target  speed,  and  the  program  rejects  all 
combinations  whose  estimates  are  greater  than  1.5  times  the 
allowable  maximum  values.  This  allows  a  50%  error  to  be 
present in initial guess estimates while still accepting
estimates from the upper range of allowed values. The initial
guess procedure has been found to be more accurate than this
50% error tolerance, but this tolerance provides a safety
cushion which prevents possible intersensor cluster
combinations from being rejected prematurely.

The second set
of constraints applied to the initial guess estimator consists
of two statistical tests that measure the consistency of the
estimates produced by two or three sensors. The first test
uses  indicator  variables  (References  5  and  6)  to  determine  if 
both sensors in a two-sensor initial guess estimate are
providing statistically compatible information. Once all
two-sensor combinations have been examined, those three-sensor
combinations which contain at least two "good" two-sensor
combinations are tested. A nonlinear regression test developed
by Gallant (Reference 7) is applied to the data to determine if
the three-sensor initial guess estimate is equivalent to the
various possible two-sensor estimates. At least two of the
three possible two-sensor estimates must be equivalent to the
three-sensor estimate for the three-sensor initial guess
estimate to pass the test.
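
The physical-constraint screen described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `InitialGuess` container, its Cartesian state relative to the sonobuoy, and the 15 m/s speed limit in the example are hypothetical and do not come from the MTTA program itself. Only the 1.5 multiplier and the 10,000 meter and 30,000 meter figures are taken from the text.

```python
import math
from dataclasses import dataclass


@dataclass
class InitialGuess:
    """Hypothetical container for one rough initial-condition estimate."""
    x_m: float      # east offset from the sonobuoy, meters
    y_m: float      # north offset from the sonobuoy, meters
    vx_mps: float   # east velocity component, m/s
    vy_mps: float   # north velocity component, m/s


def passes_physical_constraints(guess, max_range_m, max_speed_mps, tolerance=1.5):
    """Return True unless range or speed exceeds tolerance times its maximum."""
    target_range = math.hypot(guess.x_m, guess.y_m)
    target_speed = math.hypot(guess.vx_mps, guess.vy_mps)
    return (target_range <= tolerance * max_range_m
            and target_speed <= tolerance * max_speed_mps)


# Range of the day set to 10,000 m; the 15 m/s speed limit is illustrative only.
near = InitialGuess(x_m=9000.0, y_m=8000.0, vx_mps=5.0, vy_mps=-3.0)
far = InitialGuess(x_m=30000.0, y_m=0.0, vx_mps=5.0, vy_mps=-3.0)
print(passes_physical_constraints(near, 10000.0, 15.0))
print(passes_physical_constraints(far, 10000.0, 15.0))
```

The first estimate lies about 12,000 meters from the buoy, inside the 15,000 meter tolerance band, so its cluster combination is kept. The second lies at 30,000 meters, so its combination would be dropped.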


1.2.5 Integer Programming and Optimal Track Scenario


Once  the  initial  guess  and  constraints  procedures 
have  selected  a  set  of  potential  target  data  sets  P,  the  only 
remaining  problem  is  to  determine  the  subset  of  P  that  provides 
the  "optimum"  track  scenario.  In  the  MTTA,  the  optimum 
scenario is the one which maximizes the value of the likelihood
function over the set of potential scenarios. Clearly, this is
a combinatorial problem that, even for moderate size problems,
possesses a large number of possible solutions. In a fashion
analogous to Morefield (Reference 8), Tracor has chosen to pose
this problem as a 0-1 set partitioning problem, using integer
programming  techniques  to  sift  through  the  various  possible 
track  scenarios  to  find  the  optimal  set  of  tracks. 
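
To make the combinatorial structure concrete, the following Python sketch enumerates every 0-1 selection vector over a small candidate set and keeps the feasible selection with the largest total score. It is not the integer programming method used in the MTTA. The brute-force search, the `best_scenario` name, the use of additive log-likelihood scores, and the feasibility rule (no sensor-level cluster may be used by more than one selected track) are all assumptions made for illustration.

```python
from itertools import product


def best_scenario(candidates):
    """Pick the 0-1 selection of candidate tracks with the largest total score.

    candidates: list of (cluster_ids, score) pairs, where cluster_ids is a
    frozenset of the sensor-level clusters that the candidate track uses.
    """
    best_value = float("-inf")
    best_choice = ()
    for choice in product((0, 1), repeat=len(candidates)):
        used = set()
        value = 0.0
        feasible = True
        for pick, (clusters, score) in zip(choice, candidates):
            if not pick:
                continue
            if used & clusters:      # a cluster would be assigned twice
                feasible = False
                break
            used |= clusters
            value += score
        if feasible and value > best_value:
            best_value, best_choice = value, choice
    return best_choice, best_value


# Hypothetical candidates built from clusters A1, A2 (buoy A) and B1, B2 (buoy B).
P = [
    (frozenset({"A1", "B1"}), 4.0),
    (frozenset({"A1", "B2"}), 3.0),
    (frozenset({"A2", "B2"}), 2.5),
    (frozenset({"A2", "B1"}), 1.0),
]
choice, value = best_scenario(P)
print(choice, value)
```

Enumeration grows as 2 to the power of the number of candidates, which is why an integer programming formulation is preferred for anything beyond toy problems.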

1.2.6  Scenario  Simulation  Results 

With one exception, Tracor's MTTA has
successfully tracked all the targets in the multi-target
scenarios studied in this investigation. Efforts to automate
the cluster extraction process have been largely successful,
and the initial guess and integer programming procedures have
worked quite well at finding the right set of trajectories
needed to track the multiple targets. Four different simulated
scenarios have been used in this investigation, and the MTTA
successfully tracked all of the targets in three of the four
scenarios. For the one scenario that failed, three targets
were present but the MTTA only tracked two of them. Careful
examination of the results indicated that for this particular
scenario  poor  observation  geometry  caused  the  clustering 
algorithm  to  become  confused  and  to  merge  the  data  from  two 
targets  into  one  cluster.  This  confusion  was  caused  when  the 
observational  geometries  for  two  different  sonobuoys  forced  two 
different  targets  to  have  nearly  identical  frequency  and 
bearing  measurements  during  the  same  time  period,  and  the 
non-overlapping  clustering  algorithm  could  not  sort  the  data 
into  individual  target  sets.  In  spite  of  this  problem,  the 
MTTA  was  still  able  to  track  two  of  the  three  targets.  One  of 
the  other  scenarios  tested  the  MTTA's  capability  to  track 
multiple  targets  with  data  gathered  from  a  weak  signal  source 
that  caused  random  noise  to  be  included  in  the  measurement 
set.  For  this  scenario,  the  MTTA  effectively  sorted  the  true 
signals  from  the  noise  and  was  then  able  to  track  all  the 
targets  found  in  the  scenario.  The  other  two  scenarios 
presented  good  geometries  with  strong  signals,  so  the  MTTA  was 
able  to  track  all  the  targets  very  well.  The  results  from 
these  four  scenarios  are  felt  to  offer  representative  examples 
of  how  well  the  MTTA  can  work  in  multi-target  scenarios  when  no 
a  priori  information  is  available  to  initialize  the  tracker. 

1.3  Report  Organization 


Section  2  of  this  report  includes  detailed 
descriptions  of  the  cluster  analysis  work  performed  under  this 
contract.  The  next  section  describes  the  techniques  that  have 
been  used  to  automate  the  cluster  extraction  process  needed  to 
sort  multiple  target  data.  In  the  fourth  section,  the  initial 
guess  procedure  is  described  in  detail.  Section  5  describes 
the  physical  and  statistical  constraints  that  were  used  to 
eliminate unlikely intersensor cluster combinations. The
integer programming procedure used by Tracor's MTTA is
described in the next section, which also includes details of
the optimization constraints and the HTA cost function that
were  used  by  the  optimization  algorithm.  Section  7  contains  a 
summary  of  the  results  obtained  for  the  various  simulated 
scenarios  that  were  examined  during  this  investigation. 
Finally,  Section  8  contains  conclusions  from  this  year's 
research as well as recommendations for future investigations.

2.0 DATA SORTING BY CLUSTER ANALYSIS

Cluster  analysis  is  a  field  in  numerical  taxonomy 
that  uses  attributes  which  describe  a  set  of  objects  to  group 
or  sort  those  objects  according  to  the  degree  of  similarity 
between them. For the data sorting problem, measurements from
an individual target should exhibit strong similarities in
their frequency and bearing estimates, whereas measurements from
different targets should exhibit strong dissimilarities between
them. Last year's research established that single-linkage
cluster analysis showed good potential for sorting multiple
target data into single target data sets (Reference 3). It was
shown that frequency versus bearing time histories of the
measurements produced a rough, chain-like curve for each target
when the data were plotted in three dimensions. Because of the
chain-like behavior found in the data, single-linkage
clustering algorithms were chosen because they are known to
chain data together in forming clusters. Last year's results
also showed that Euclidean distances between object attributes
were the most useful for generating the resemblance matrix of
dissimilarity coefficients needed to cluster the data. This
section contains the results of current studies which sought to
find ways to improve the usefulness of the single-linkage
clustering  concept. 

2.1  Description  of  Multi-Target  Scenario  Data 

Before  evaluating  any  clustering  algorithm's 
performance,  some  multi-target  data  had  to  be  obtained  to 
perform  the  data  sorting  tests.  For  this  study,  simulated 
multiple  target  data  were  created  because  no  real  data  were 
readily  available.  Frequency  and  bearing  measurements  for  each 
target were output at fixed time intervals, and these data were
then merged by time tags to simulate a multiple target data
set.  This  procedure  was  performed  for  each  of  the  three 
sonobuoys  found  in  a  scenario,  so  that  at  the  conclusion  of  the 
simulation,  a  separate  multi-target  data  set  was  generated  for 
each  sonobuoy.  The  method  used  to  generate  this  multiple 
target  data  was  identical  to  the  method  used  in  last  year's 
report  to  generate  the  multiple-linetracker  data  (Reference  3). 

2.2 Moody and Jardine's Overlapping Single Linkage Clustering Algorithm

Three  cluster  analysis  algorithms  were  evaluated 
during  the  current  investigation.  The  first,  Moody  and 
Jardine's  algorithm  (Reference  9),  is  classified  as  an 
overlapping  single-linkage  algorithm.  Overlapping  means  that 
two  clusters  share  some  points  by  overlapping  their  boundaries 
at  these  points,  but  they  also  maintain  separate  boundaries 
away  from  these  points.  An  illustration  of  two  overlapping 
clusters  is  presented  in  Figure  2.1. 


Figure 2.1 - Example of two overlapping clusters.


Note  how  the  boundaries  for  clusters  1  and  2  overlap  at  the 
apex  of  the  data  points  in  Figure  2.1;  they  are  still 
classified  as  separate  clusters,  but  share  five  points  in 
common.  For  non-overlapping  techniques,  the  five  ambiguous 
data  points  in  question  may  have  been  in  either  cluster  1  or 
cluster  2,  they  may  have  been  grouped  in  a  cluster  by 
themselves,  or  they  may  have  served  as  a  link  to  join  all  the 
data  into  one  new  cluster.  However,  the  points  in  question 
could  not  be  contained  in  two  separate  clusters  as  was 
illustrated  in  Figure  2.1. 

It was originally felt when the algorithm
was programmed that such an overlapping scheme might be useful
for cases when a non-overlapping clustering algorithm could not
clearly differentiate the data from two different targets.
Certain geometries are possible where a given sensor may not be
able to clearly spot the differences between frequency and
bearing estimates for two different targets, and it was feared
that a non-overlapping algorithm would chain the two different
data sets into one cluster at the point of intersection between
the two data sets. Clearly, if cluster analysis was to sort
multiple target data into clusters for each separate target,
one would not like the two data sets to be combined into one
cluster and then try to use this cluster as data from a single
target. It was hoped that this overlapping technique would
be useful for preventing these ambiguities from confusing the
cluster analysis program. However, after closely examining the
performance of the algorithm, it was decided to continue
usage of non-overlapping, single linkage clustering algorithms
for the following reasons:


1) The B_k algorithm was found to be much
slower computationally than the
non-overlapping algorithms,

2) It was felt that if sufficient automatic
cluster extraction rules could be developed
for a non-overlapping algorithm, then data
from the two clusters could be sorted before
the data sets merged together. This would
remove the need for an overlapping
algorithm, and

3) Finally, it was felt that automatic cluster
extraction rules could be perfected for at
least one of the clustering algorithms.
Since Ling had already looked at some
aspects of this problem for his (1,r)
algorithm, it was decided to expand upon
this work and concentrate on using the (1,r)
algorithm to automate the cluster extraction
process.

2.3 Selection of Ling's (1,r) Clustering Algorithm

After eliminating Moody and Jardine's B_k
overlapping clustering algorithm from further consideration,
two  non-overlapping,  single  linkage  clustering  algorithms  were 
left.  One  of  the  candidate  algorithms  was  the  CLUSTAR  single 
linkage  algorithm  (Reference  10)  that  was  used  to  produce  the 
clustering  tree  diagrams  found  in  last  year's  report.  This 
algorithm  was  found  to  perform  well,  but  the  dendrogram  outputs 
were  not  considered  to  be  very  useful  for  automatic  cluster 
extraction. Having had the advantage of knowing in advance
what data clusters to look for, it was possible to find the
desired clusters in the tree diagram outputs. Without this a
priori knowledge, it was doubtful that the correct data
clusters could have been picked from the output.
Clearly, either CLUSTAR's output had to be modified to make it
easier to obtain the clustered points, or another algorithm had
to be found which performed as well but whose output would be
easier to use in the cluster extraction process.

The  second  candidate  clustering  algorithm 
available  was  Ling's  (k,r)  algorithm  (References  11  and  12). 
This  algorithm  requires  at  least  k  points  to  be  within  some 
distance  r  in  similarity  before  the  data  points  can  be  grouped 
together into a cluster. For k = 1, Ling points out that his
(1,r) algorithm is identical to a single-linkage clustering
algorithm. Since CLUSTAR's single-linkage algorithm had
performed so well, Ling's (1,r) algorithm was compared to it to
see if the (1,r) algorithm could be used. Indeed, Ling's (1,r)
algorithm output the same results, but it had a few advantages
which made it preferable to CLUSTAR's algorithm. First, the
clustering results output by the algorithm provided a set of
data points for each new cluster, not clustering tree diagrams
as were seen before. Second, Ling had proposed two different
decision making rules that could be used with his (1,r)
algorithm to aid in the automatic extraction of isolated
clusters. Since we have sought to automate the cluster
extraction process, it was hoped that these decision rules
would be beneficial in picking the isolated clusters.
Furthermore, both algorithms took about the same amount of
processing time to sort the data, so neither could be penalized
for its computing cost. Since both algorithms
produce the same results for about the same cost, it was
decided to use Ling's (1,r) algorithm because the output was
easier to use and because some decision making rules were
available to aid in automating the cluster extraction process.

2.4  Data  Normalization 


It  should  be  recalled  from  last  year's  report 
that  the  raw  attribute  data  for  each  measurement  had  to  be 
normalized  in  some  fashion  to  give  each  attribute  roughly  the 
same  range  of  numerical  values  before  the  resemblance  matrix 
could be generated. For instance, 360 second trajectories have
been used to generate the multi-target data, so the raw time
units vary from 0 to 360. Over such a short time interval and
for the center frequency values and geometries used in our
simulations, the Doppler shifted frequency measurements
typically vary by only 0.1 to 0.2 Hz. The bearings typically
change by at most 20° to 30°, so the cosine and sine of the
bearing measurements vary by only a few tenths. Unfortunately,
this large variation in time values as compared to the change
in the frequency and the cosine and sine of the bearing
estimates caused the clustering algorithm severe problems when
the raw attributes were used. When Euclidean distances were
used to generate a resemblance matrix for the data, the large
difference in time units dominated the dissimilarity
coefficient while differences in the other attributes were
virtually ignored. Because of this problem, the raw attribute
measurements must somehow be scaled to provide the same order
of magnitude of differences between sample measurements for each
of the attributes.

Last  year,  experiments  were  run  to  test  the  five 
different data standardization techniques available in CLUSTAR
to find the single one that best fit our problem. The one
which was finally picked scaled each of the attributes so that
their respective range of values fell between 0 and 1.
Specifically, in last year's report, each of the attributes
was scaled by the following transformation:


    X'_{ij} = \frac{X_{ij} - \min_i (X_{ij})}{\max_i (X_{ij}) - \min_i (X_{ij})}

where:

    subscript i refers to the ith sample,
    subscript j refers to the jth attribute,
    \max_i (X_{ij}) is the maximum value over all i samples of the jth attribute, and
    \min_i (X_{ij}) is the minimum value over all i samples of the jth attribute.

This  normalization  process  worked  adequately  for  our 
investigation,  but  one  big  problem  was  encountered  with  this 
technique.  The  maximum  and  minimum  values  used  for 
normalization  varied  with  each  data  set,  so  no  two  data  sets 
were  normalized  in  the  same  fashion.  This  was  felt  to  cause 
large  fluctuations  in  the  values  for  the  dissimilarity 
coefficients for the different data sets found in last year's
clustering tree diagrams. Due at least in part to this
variation in the data normalization process, the cutoff
threshold for halting the clustering process varied
substantially from one data set to another. Because of these
problems, a standard data normalization procedure was
instituted in this year's investigation. The following
procedure  has  been  used  to  normalize  the  input  attribute  data 
for  each  measurement  set: 


1) t' = t/120

2) (cos B)' = 5 * cos B

3) (sin B)' = 5 * sin B

4) F = 10 * f
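
The four scalings above, together with the Euclidean dissimilarity used to build the resemblance matrix, can be sketched as follows. The function names are illustrative, and the choice of seconds, radians, and hertz as input units is an assumption made for the example.

```python
import math


def normalize(t_sec, bearing_rad, freq_hz):
    """Scale one measurement to the four normalized attributes."""
    return (
        t_sec / 120.0,
        5.0 * math.cos(bearing_rad),
        5.0 * math.sin(bearing_rad),
        10.0 * freq_hz,
    )


def resemblance_matrix(measurements):
    """Euclidean dissimilarity between every pair of normalized measurements."""
    points = [normalize(*m) for m in measurements]
    return [[math.dist(p, q) for q in points] for p in points]


# Three hypothetical (time, bearing, frequency) measurements.
data = [(0.0, 0.0, 150.0), (120.0, 0.0, 150.0), (360.0, 0.0, 150.1)]
matrix = resemblance_matrix(data)
print(matrix[0][1])
```

With this scaling a 360 second track spans three time units, which is comparable to the one or two units spanned by a 0.1 to 0.2 Hz frequency change, so no single attribute dominates the distance.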


This  procedure  yields  a  consistent  technique  for  scaling  all 
the  attributes  for  every  problem.  Furthermore,  for  the 
scenarios  investigated  here,  this  normalization  procedure 
allows  each  attribute  to  vary  by  about  two  or  three  units. 
Tests  have  shown  the  data  normalization  procedure  described 
above  works  very  well,  so  this  technique  was  used  to  normalize 
all data simulated in this study.

3.0 AUTOMATIC CLUSTER EXTRACTION

After Ling's (1,r) algorithm sorts the data into
clusters, some way is needed to pick the best clusters out of
the set. Last year when the tree diagrams were used, the a
priori knowledge of how the data should be properly sorted was
used to pick the best set of clusters. However, with Ling's
(1,r) algorithm, we have sought to automate the process by
which the best clusters are chosen. Thus, we are not forced to
rely on user interpretation to find the best results and can
make the procedure more usable for real applications. In this
report, the process of automating the procedure for picking the
best clusters is referred to as automatic cluster extraction.
First, isolated clusters are identified, and then two tests are
employed to cull both noise clusters and internally
inconsistent clusters. Following is a detailed discussion of
this  automatic  cluster  extraction  process. 

3.1  Single  Linkage  Clustering  and  Cluster  Formations 

A  clustering  procedure  is  said  to  be  hierarchical 
if,  for  any  given  pair  of  clusters  produced  by  the  procedure, 
either  one  cluster  is  a  subset  of  the  other  or  they  are 
disjoint.  Thus,  a  hierarchical  clustering  procedure  produces  a 
nested  structure  that  can  be  represented  by  a  tree  diagram  or  a 
dendrogram. 


Single-linkage clustering is a hierarchical
procedure that is based on the "nearest neighbor" concept. A
brief description of single-linkage clustering follows. Given
a set of objects

    S_0 = \{ O_1, O_2, \ldots, O_n \}

and a distance d that is defined for each object pair (O_i,
O_j), the single-linkage clustering procedure first determines
the closest pair of objects in the set and then joins them to
form a cluster C. By defining the distance between cluster C
and any object O_k to be

    d(C, O_k) = \min \{ d(O_i, O_k) : O_i \in C \}

and forming a new object set S_1 that consists of C and all
points of S_0 not contained in C (i.e., S_1 = \{C\} \cup \{S_0 \setminus C\}),
the above process can be repeated using S_1 in place of S_0.
If the distance between two clusters is defined to be

    d(C', C'') = \min \{ d(O_i, O_j) : O_i \in C', O_j \in C'' \}

then the above procedure can be repeated for sets S_2, S_3,
... until all objects are grouped into one large cluster.
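
The procedure just described can be sketched directly in Python. This is a deliberately simple, unoptimized illustration and is not the CLUSTAR or Ling implementation. The function name and the list-of-lists distance matrix are assumptions. Each pass joins the two closest clusters, using the minimum pairwise distance as the cluster-to-cluster distance, and records the resulting object set.

```python
def single_linkage(dist):
    """Return the successive object sets S_0, S_1, ... for a distance matrix.

    dist is a symmetric n x n list of lists; each returned set is a list of
    frozensets of object indices.
    """
    clusters = [frozenset([i]) for i in range(len(dist))]
    history = [list(clusters)]
    while len(clusters) > 1:
        best = None
        for a in range(len(clusters)):
            for b in range(a + 1, len(clusters)):
                # Cluster-to-cluster distance is the closest pair of members.
                d = min(dist[i][j] for i in clusters[a] for j in clusters[b])
                if best is None or d < best[0]:
                    best = (d, a, b)
        _, a, b = best
        merged = clusters[a] | clusters[b]
        clusters = [c for k, c in enumerate(clusters) if k not in (a, b)]
        clusters.append(merged)
        history.append(list(clusters))
    return history


# Three objects: 0 and 1 are close, 2 is nearer to 1 than to 0.
d = [
    [0.0, 1.0, 5.0],
    [1.0, 0.0, 2.0],
    [5.0, 2.0, 0.0],
]
history = single_linkage(d)
for level, object_set in enumerate(history):
    print(level, [sorted(c) for c in object_set])
```

Objects 0 and 1 join first, and object 2 is then chained on through its link to object 1 even though it is far from object 0. This chaining is the behavior that makes single linkage suit the chain-like target curves.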

Single linkage clustering is a nonmetric
procedure in the following sense. If the original distances
d(O_i, O_j) are replaced by their rank orderings (i.e., the
smallest is replaced by 1, the next smallest by 2, etc.), and
the above algorithm is applied to the set S_0, then the sets
S_1, S_2, S_3, ... will be unchanged (i.e., the same
collection of clusters will be produced). Thus, each distance
d on the object set S_0 can be replaced by a symmetric n x n
matrix, M_{d,s}, having integral entries ranging from 1
(denoting the distance between the closest pair of objects) to
n(n-1)/2 (denoting the distance between the two objects that
are farthest apart). Figure 3.1 illustrates the concepts
discussed above. The fact that each pair (S_0, d) can be
replaced by the pair (S_0, M_{d,s}) results in a great
simplification since for S_0 = \{O_1, O_2, \ldots, O_n\}, there
are only (n(n-1)/2)! distinct possible choices for M_{d,s}.

[Figure 3.1: example for S_0 = \{O_1, O_2, O_3\}, tabulating object combinations, distances, and rank orderings.]

This  fact  will  be  exploited  when  measures  of  cluster 
significance  are  considered  since  the  output  from  a  clustering 
algorithm  consists  of  a  large  number  of  clusters,  some  of  which 
are  contained  in,  and  some  of  which  contain,  the  target  data 
sets  sought. 
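
The replacement of distances by their rank orderings can be illustrated with a short Python sketch. The function names are hypothetical, and distinct pairwise distances are assumed so that the ranking is unambiguous.

```python
from itertools import combinations
from math import factorial


def rank_matrix(dist):
    """Replace each pairwise distance by its rank (1 = closest pair)."""
    n = len(dist)
    pairs = sorted(combinations(range(n), 2), key=lambda p: dist[p[0]][p[1]])
    ranks = [[0] * n for _ in range(n)]
    for rank, (i, j) in enumerate(pairs, start=1):
        ranks[i][j] = ranks[j][i] = rank
    return ranks


def count_rank_matrices(n):
    """Number of distinct rank matrices on n objects: (n(n-1)/2)!."""
    return factorial(n * (n - 1) // 2)


d = [
    [0.0, 0.4, 2.5],
    [0.4, 0.0, 1.1],
    [2.5, 1.1, 0.0],
]
print(rank_matrix(d))
print(count_rank_matrices(3))
```

Because single linkage only ever compares distances, running it on the rank matrix joins the same pairs in the same order as running it on the original distances.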


3.2  The  Isolation  Index 

When  performing  cluster  analysis  on  a  set  of 
objects,  each  cluster  is  formed  at  a  particular  clustering 
level.  For  single  linkage  clustering,  these  levels  can  be 
taken  to  be  integers  beginning  with  1.  Formally,  the  single 
linkage  clustering  level  r  of  a  cluster  C  is  defined  by 

    r = \min \{ k : C \in S_k \}.

Some authors refer to r as the "time of birth" of cluster C.
If the sets S_1, S_2, S_3, ... are called clustering sets,
then the clustering level of a cluster C is just the index of
the clustering set in which C first appears. The index k of
set S_k will be referred to as the clustering level k.

After  a  cluster  has  first  formed  at  level  r,  it 
may  remain  unaltered  for  many  future  clustering  levels. 
However,  when  C  is  not  equal  to  the  entire  set  of  objects,  it 
follows  that  eventually  C  will  be  augmented  to  form  a  new 
cluster C'. A cluster that remains unchanged for a number of
levels is said to be well isolated. The following definition
quantifies an isolation index measure possessed by cluster C:

    i(C) = r' - r,

where r' is the clustering level of the augmented cluster C'
that is described above. The clustering level r' can be
thought of as the "time of death" of cluster C. In Figure 3.1,
cluster C_1 is "born" at level 1 and "dies" at level 2.
Therefore, i(C_1) = 2 - 1 = 1.
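
The birth level, death level, and isolation index of every cluster can be computed with a sketch like the one below. It assumes that clustering levels are the ranks of the pairwise distances (level 1 is the closest pair, level 2 the next closest, and so on), which is one reading of the integer levels described above. The function name and the choice to skip single-object clusters are also assumptions.

```python
from itertools import combinations


def isolation_indices(dist):
    """Map each multi-object cluster to (birth level r, death level r', i(C)).

    The final cluster containing every object never dies, so it is omitted.
    """
    n = len(dist)
    pairs = sorted(combinations(range(n), 2), key=lambda p: dist[p[0]][p[1]])
    component = {i: frozenset([i]) for i in range(n)}
    birth = {}
    result = {}
    for level, (i, j) in enumerate(pairs, start=1):
        ci, cj = component[i], component[j]
        if ci == cj:
            continue  # both objects already share a cluster; nothing changes
        merged = ci | cj
        for old in (ci, cj):
            if old in birth:  # an existing cluster is augmented, so it "dies"
                result[old] = (birth[old], level, level - birth[old])
        birth[merged] = level
        for k in merged:
            component[k] = merged
    return result


# Four objects at positions 0, 1, 3, and 10 on a line.
positions = [0.0, 1.0, 3.0, 10.0]
d = [[abs(p - q) for q in positions] for p in positions]
indices = isolation_indices(d)
for cluster, (r, r_death, i_c) in indices.items():
    print(sorted(cluster), r, r_death, i_c)
```

Here the pair {0, 1} is born at level 1 and dies at level 2, while {0, 1, 2} is born at level 2 and survives until level 4, so the larger cluster is the better isolated of the two.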

3.3  Isolation  Index  Distribution  -  Survival  Function 


Recall that the matrix M_{d,s} determines the
clusters that are produced by a single-linkage clustering
procedure. By choosing a null probability model on M_{d,s},
it is possible to use probabilistic techniques to assess the
significance of single-linkage clusters. The simplest and most
conservative model to impose on the space of all possible
random matrices M_{d,s} is to assume that each matrix is
equiprobable. Under this assumption, the mathematics are
tractable and the statistical theory resulting from the use of
this model can be interpreted as establishing bounds for other
models (i.e., a cluster that is judged to be statistically
insignificant according to this model is unlikely to be
significant under any other model). Under the equiprobable
assumption, the isolation index of a cluster of size j becomes
a random quantity whose distribution can be calculated.
Specifically, if C is a cluster of size j with clustering level
r, then


    \mathrm{Prob}[i(C) = s] = \frac{j(n-j)}{s} \binom{N - r - j(n-j)}{s-1} \Big/ \binom{N-r}{s}

where N = \binom{n}{2}. The probability of the event \{i(C) = s\} being
small implies that the chance of a cluster with this isolation
index appearing simply by accident or at random is small.
Thus, a good assumption is that these data "belong together,"
i.e., came from the same target. Therefore, the survival
function  is  used  to  help  select  the  set  of  candidate  clusters; 
a  probability  threshold  is  set  and  those  clusters  with  survival 
function  values  falling  below  the  threshold  are  passed  on  for 
further  testing. 
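
A numerical sketch of the isolation index distribution and its survival function is given below. The closed forms in the code are derived here directly from the equiprobable model, under the assumption that clustering levels are ranks of the N = n(n-1)/2 pairwise distances: after level r there are N - r unused pairs, of which j(n-j) would join the cluster to an outside object. They should be checked against Ling's papers (References 11 and 12) before being relied upon, and the function names are illustrative.

```python
from math import comb


def _pools(n, j, r):
    """Return (unused pairs, linking pairs, non-linking unused pairs)."""
    unused = comb(n, 2) - r
    links = j * (n - j)
    inert = unused - links
    if inert < 0:
        raise ValueError("level r is too large for a cluster of this size")
    return unused, links, inert


def isolation_pmf(n, j, r, s):
    """P[i(C) = s]: the first linking pair arrives exactly s levels after r."""
    unused, links, inert = _pools(n, j, r)
    if s < 1 or s > inert + 1:
        return 0.0
    return links * comb(inert, s - 1) / (s * comb(unused, s))


def isolation_survival(n, j, r, s):
    """P[i(C) >= s]: none of the next s - 1 pairs touches the cluster boundary."""
    unused, links, inert = _pools(n, j, r)
    if s < 1:
        return 1.0
    if s > inert + 1:
        return 0.0
    return comb(inert, s - 1) / comb(unused, s - 1)


# A pair (j = 2) born at level r = 1 among n = 4 objects.
print([isolation_pmf(4, 2, 1, s) for s in (1, 2)])
print(isolation_survival(4, 2, 1, 2))
```

In this small case four of the five unused pairs touch the cluster, so an isolation index of 2 or more occurs by chance only one time in five. A cluster whose observed index has a survival value below the chosen threshold is the kind that would be passed on for further testing.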

3.4  Polarization  Test  and  Cluster  Extraction 


Once  the  isolation  index  has  been  used  to 
identify  clusters  of  data  that  possess  greater  than  expected 
similarities,  these  clusters  must  then  be  analyzed  to  determine 
their  randomness  and  internal  consistency.  Two  primary  reasons 
for  doing  this  take  advantage  of  the  way  in  which  single 
linkage  clustering  joins  data  together  (see  section  3.1): 

1)  Random  noise  can  appear  as  an  isolated 
cluster,  either  when  targets  are  present  or 
when  just  random  noise  is  clustered,  and 

2) As larger and larger clusters are formed
from the data, isolated clusters can become
subsets of larger isolated clusters.
Eventually, there may be several isolated
clusters contained in one large isolated
cluster, and they may or may not be
observations for the same target.

This section discusses the test for randomness
employed by the MTTA. To motivate the selection of this test,
imagine a cluster of data triples (time, frequency, and
bearing) projected onto the frequency-bearing plane (Figure
3.2 provides several examples). Then,

1) Figure 3.2.a represents random noise;
frequency and bearing estimates are
scattered randomly about,

2) Figure 3.2.b represents data from a target
showing significant bearing change but
little Doppler change,

3) Figure 3.2.c represents data for a target
that shows little bearing change but
substantial frequency change, possibly
approaching CPA with the sonobuoy, and

4) Figure 3.2.d represents data from a target
with little bearing change or frequency
change. This is a typical plot for a target
in which a large amount of data has been
gathered over a short time interval.

If a grid is imposed on the plane, then the
number of observations occurring in a cell can be tallied and
the number of observations occurring in a particular row or
column can also be tallied. Then for clusters of random noise
(as in Figure 3.2.a), the row totals should be roughly equal
and the column totals should all be roughly equal, while for
target clusters, such as those displayed in Figures 3.2.b and
c, either the row or column totals (or both) will display very
distinct departures from equality. If a multinomial
distribution is assumed for these row and column totals, then
Alam and Mitra (Reference 4) have devised a test to determine
if the mass of the points is spread out evenly over the row and
column totals or if it is concentrated, or polarized, in just a
few of the rows and columns. A brief explanation of the test
follows.


A partial ordering of the probability vectors
p = (p_1, \ldots, p_k), j = 1, 2, \ldots, k, is defined by the relation:

    p \succ p'  if  \sum_{i=1}^{m} p_{(i)} \geq \sum_{i=1}^{m} p'_{(i)},  m = 1, \ldots, k,

where p_{(i)} is the ith largest value among p_1, \ldots, p_k.
Then, a multinomial distribution with probability vector p' is
said to be less polarized than the multinomial distribution
with probability vector p if p \succ p'. Note that

    (1/k, \ldots, 1/k) \prec (1/(k-1), \ldots, 1/(k-1), 0) \prec \cdots \prec (1, 0, 0, \ldots, 0).

Let R = (r_1, \ldots, r_k), \sum_j r_j = n, be the k row
totals associated with a particular cluster and consider the
following test:

    H_0: R = R'  against the alternative  H_1: R \succ R',

where R' is the equally likely case. Alam and Mitra consider
the test statistic

    T(R) = \sum_{j=1}^{k} r_j^2 / n

and show that H_0 is rejected for large values of T. The
critical region of the test is

    T(R) > C,  where  P[T(R) > C | H_0] = \alpha.

For R' = (1/k, 1/k, \ldots, 1/k) Alam and Mitra show that

    T* = k T(R) - n

is asymptotically distributed as \chi^2_{k-1}. A similar test can
also be conducted for the column totals.

In applying this test, the grid size selected was
25 by 25, with bearing running from 0 to $2\pi$ radians and
frequency from 148.8 to 151.2 Hz. In their paper, Alam and
Mitra briefly discuss the convergence properties of the test
and show that $n$ must be fairly large before the $\chi^2_{k-1}$
approximation is very good. This was observed in the current
study. For large clusters ($n \geq 60$), testing at the .995
percentage point of the $\chi^2_{24}$ distribution was very
effective at distinguishing noise clusters from data clusters.
For clusters of size less than forty, however, the test still
discriminated very well but the $\chi^2_{24}$ distribution did not
provide a good critical value. Clusters composed mostly of
observational data had high values of $T^*$ while clusters
composed mostly of noise had much lower values of $T^*$, yet
the values of $T^*$ found for noise clusters were still higher
than the cut-off points supplied by the $\chi^2_{24}$ distribution.
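The statistic is simple to evaluate once the row (or column) totals of the
grid are in hand. The following is a minimal sketch, not the report's own
code: it assumes the statistic has the form T(R) = (sum of squared totals)/n
with T* = kT(R) - n, and the function name and the two sample clusters are
illustrative only. The constant 45.56 is the tabulated .995 point of the
chi-square distribution with 24 degrees of freedom.

```python
def polarization_statistic(totals):
    """Return (T, T_star) for a list of k cell totals that sum to n."""
    k = len(totals)
    n = sum(totals)
    if n == 0:
        raise ValueError("cluster is empty")
    t = sum(r * r for r in totals) / n   # T(R) = sum of r_j^2 / n
    t_star = k * t - n                   # T* = k T(R) - n
    return t, t_star


# Tabulated .995 percentage point of chi-square with 24 degrees of freedom.
CHI2_24_995 = 45.56

# Hypothetical row totals on a 25-bin axis.
spread = [2] * 25                            # 50 points spread evenly
peaked = [30, 20] + [1] * 10 + [0] * 13      # 60 points, mostly in two bins

_, spread_t_star = polarization_statistic(spread)
_, peaked_t_star = polarization_statistic(peaked)
print(spread_t_star, spread_t_star > CHI2_24_995)
print(peaked_t_star, peaked_t_star > CHI2_24_995)
```

An evenly spread cluster gives a T* of zero, while a cluster concentrated in
two bins gives a value far above the chi-square point.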
At this point there were two basic choices:

1) Compute the conditional distribution
function (CDF) of $T(R)$ directly using the
recursion formula presented by Alam and
Mitra, or

2) Determine whether some simple "rule-of-thumb"
existed which would allow reasonable
decisions to be made.

The  second  alternative  was  chosen  because: 

1)  Implementation  of  the  Alam  and  Mitra  CDF 
algorithm  would  have  been  a  formidable 
programming  task,  particularly  for  n  >  5, 

2)  Including  the  calculation  as  a  subroutine 
would  have  imposed  a  substantial  computation 
penalty,  and 

3)  A  simple  and  reasonable  "rule-of-thumb" 
value  was  readily  available. 

Primary  considerations  involved  in  selecting  a  cut-off  value 
were: 

1) It should not reject $H_0$ for very small
clusters ($n \leq 5$) no matter how concentrated
their members are,

2) It should reject $H_0$ for small clusters
($5 < n \leq 15$) only if most of their members are
concentrated in one or two cells, and

3) It should reject $H_0$ for moderate sized
clusters ($15 < n < 40$) only if a significant
number (40% or better) of their members are
concentrated in one or two cells.

Figure 3.3 contains an example of several graphs
that were constructed for various values of $T^*$. Simulation
results had shown that for small time periods (< 5 minutes),
clustered target bearing and frequency data would usually be
confined to two or three cells, with some noise points possibly
scattered about in other cells. Thus, curves were constructed
for $T^*$ as a function of $x_1$, $x_2$, $x_3$ where:

1) $x_i$ was the number of points in cell i,

2) $x_1$ ranged from 0 to $n$,

3) $x_2$ ranged from 0 to $n - x_1$,

4) $x_3$ ranged from 0 to $n - x_1 - x_2$, and

5) the remaining $n - x_1 - x_2 - x_3$ points were
considered to be placed one to a cell.

Clearly, the value of $T^*$ is invariant to which cells contain
the points, so that the choice of $x_1$, $x_2$, $x_3$ is not
important.


The dashed lines of Figure 3.3 indicate the two
cutoff points: the leftmost line corresponds to the .995
percentage point of the $\chi^2_{24}$ distribution and the rightmost
line corresponds to twice this number. Keeping in mind that
the hypothesis being tested is

$$H_0: \mathbf{p} = (1/k, \ldots, 1/k) \quad \text{versus} \quad H_1: \mathbf{p} \succ (1/k, \ldots, 1/k),$$

it is clear that the rule-of-thumb value requires substantially
higher cut-off values than does the approximation value.
Additionally, the rule-of-thumb value will not reject $H_0$ for
n < 5 and requires almost total concentration in a single cell
for n near 5. For n > 10, concentrations in the 40% to 60%
range are required for single cell polarization and in the 80%
to 90% range for two and three cell polarizations. The
comparable $\chi^2_{24}$ approximation values are at least 20% to 30%
below this. Using the rightmost line as the cut-off has worked
quite well in this study. In order for a cluster to be
declared "random" it must accept $H_0$ for both the row and
column totals.
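The row-and-column decision can be sketched as follows. This is an
illustration under stated assumptions rather than the report's routine: the
grid is taken to be a 25 by 25 list of cell counts for a non-empty cluster,
the statistic is taken as T* = k(sum of squared totals)/n - n, and the names
`t_star`, `is_random_cluster`, and the two sample grids are hypothetical. The
cut-off is twice the tabulated .995 point of chi-square with 24 degrees of
freedom (45.56).

```python
CHI2_24_995 = 45.56                          # tabulated .995 point, 24 d.f.
RULE_OF_THUMB_CUTOFF = 2.0 * CHI2_24_995     # the "rightmost line"


def t_star(totals):
    """T* = k * (sum of r_j^2 / n) - n for one set of marginal totals."""
    k = len(totals)
    n = sum(totals)
    return k * sum(r * r for r in totals) / n - n


def is_random_cluster(grid, cutoff=RULE_OF_THUMB_CUTOFF):
    """A cluster is 'random' only if H0 is accepted for rows AND columns."""
    row_totals = [sum(row) for row in grid]
    col_totals = [sum(col) for col in zip(*grid)]
    return t_star(row_totals) <= cutoff and t_star(col_totals) <= cutoff


def empty_grid(size=25):
    return [[0] * size for _ in range(size)]


# 50 points scattered two per row and two per column.
spread = empty_grid()
for i in range(25):
    spread[i][i] = 2

# 40 points in a single cell plus 10 scattered points.
peaked = empty_grid()
peaked[3][7] = 40
for i in range(10, 20):
    peaked[i][i] = 1

print(is_random_cluster(spread), is_random_cluster(peaked))
```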


3.5  Internal  Consistency  and  Cluster  Extraction 

This  section  discusses  the  final  stage  in  the 
cluster  extraction  process,  the  test  for  internal  consistency. 
Figure  3.4  illustrates  the  logic  flow  for  the  procedure. 

At  this  point  in  the  extraction  process,  the 
clusters  under  consideration  are  assumed  to  be  well-isolated, 
non-random  groups  of  data.  What  remains  is  to  examine  for 
consistency  each  cluster  that  contains  two  or  more  isolated 
clusters  as  subsets.  Basically,  the  idea  is  to  determine 
whether the information contained in all the isolated
subclusters is compatible or not.

3.5.1 Regression and Internal Consistency


One  of  the  primary  difficulties  in  finding  an 
internal  consistency  test  for  isolated  clusters  which  contain 
other  isolated  clusters  as  subsets  is  that  the  frequency  and 
bearing  estimates  are  functions  of  time  within  these  clusters. 
Typically,  these  isolated  subclusters  correspond  to  time 
segments  of  observations  for  one  or  more  targets;  they  start  as 
small  groups  of  two  or  three  points  and  then  chain  outward 
until  the  cluster  under  consideration  is  formed.  Under  these 
circumstances,  usual  measures  of  similarity  such  as  means, 
ranges,  modes,  etc.,  can  be  very  misleading.  For  a  particular 
target,  the  average  frequency,  bearing  sine,  or  bearing  cosine 
from  the  first  part  of  an  observation  period  can  be  quite 
different  from  the  average  computed  from  the  last  part  of  the 
observation  period.  Based  on  these  considerations,  regression 
techniques  were  used  to  develop  the  internal  consistency  test. 


When  frequency,  bearing  sine,  and  bearing  cosine 
for  a  particular  target  are  plotted  against  time  for  short 
intervals  (less  than  six  minutes) ,  they  are  nearly  linear 
functions  and  can  be  approximated  fairly  well  by  straight 
lines.  Measurements  from  two  different  and  distinguishable 
targets define different straight lines, and this forms the
basis for the use of single-linkage clustering techniques.
However, as was shown in Section 3.1, the single-linkage
algorithm creates ever larger clusters by joining points or
clusters to already existing clusters. If clusters are joined
from two different and distinguishable targets, then the
problem of internal consistency can be cast as a regression
problem to determine whether a set of data defines one or two
regression lines. The lines could differ by a change in
intercept, a change in slope, or by both.
Neter and Wasserman (Reference 6) give a
procedure for doing this. The basic formulation of the model
is:

1) Let $Y_i$ be the measurement under
examination (frequency, bearing sine, or
bearing cosine) and let $T_i$ be the time of
the measurement,

2) Let $I_i$ be an indicator variable attached
to $Y_i$ such that $I_i = 0$ if $Y_i$ comes
from the first cluster and $I_i = 1$ if $Y_i$
comes from the second cluster, and

3) The model becomes

$$Y_i = B_0 + B_1 T_i + B_2 I_i + B_3 T_i I_i + \epsilon_i$$

and the response (regression function) is

$$E(Y) = B_0 + B_1 T + B_2 I + B_3 T I.$$

To understand the function of the indicator variable in this
model, note that for measurements from the first cluster ($I = 0$),

$$E(Y) = B_0 + B_1 T + B_2(0) + B_3(0)(T) = B_0 + B_1 T$$

and for measurements from the second cluster ($I = 1$),

$$E(Y) = B_0 + B_1 T + B_2(1) + B_3(1)(T) = (B_0 + B_2) + (B_1 + B_3) T.$$
Thus, $B_2$ measures the difference in intercept estimates
between the two lines and $B_3$ measures the difference in
slope estimates between the two lines. Therefore, the test for
equality of the two regression lines corresponds to a test of
the hypothesis

$$H_0: B_2 = B_3 = 0 \quad \text{versus} \quad H_1: B_2 \text{ or } B_3 \neq 0.$$

The test statistic is

$$F^* = \frac{SS(B_2, B_3 \mid B_0, B_1)/2}{SSE/(n_1 + n_2 - 4)}$$

where

1) $SS(B_2, B_3 \mid B_0, B_1)$ is the extra regression
sum of squares (equivalently, the reduction in the
error sum of squares) due to the inclusion of $I$
and $T I$ in the model,

2) $SSE$ is the error sum of squares for the
full model, and

3) $n_1$ is the number of objects in cluster 1
and $n_2$ is the number of objects in cluster 2.
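A minimal sketch of this F statistic is given below. It is not the report's
code and relies on one standard equivalence: the full model with the
indicator and interaction terms has the same error sum of squares as two
lines fitted separately to the two clusters, so the extra sum of squares is
the pooled-line SSE minus the two-line SSE. The function names and the sample
data are illustrative, and each cluster is assumed to hold at least two
distinct times.

```python
def line_sse(times, values):
    """Error sum of squares of the least squares line through (t, y)."""
    n = len(times)
    t_bar = sum(times) / n
    y_bar = sum(values) / n
    stt = sum((t - t_bar) ** 2 for t in times)
    sty = sum((t - t_bar) * (y - y_bar) for t, y in zip(times, values))
    slope = sty / stt
    return sum((y - y_bar - slope * (t - t_bar)) ** 2
               for t, y in zip(times, values))


def two_line_f(t1, y1, t2, y2):
    """F* for H0: B2 = B3 = 0 (one common line fits both clusters)."""
    sse_full = line_sse(t1, y1) + line_sse(t2, y2)        # separate lines
    sse_reduced = line_sse(list(t1) + list(t2),           # one pooled line
                           list(y1) + list(y2))
    df_error = len(t1) + len(t2) - 4
    return ((sse_reduced - sse_full) / 2.0) / (sse_full / df_error)


# Illustrative data: two time segments with a small alternating error.
noise = [0.1, -0.1, 0.1, -0.1, 0.1]
t1 = [0.0, 1.0, 2.0, 3.0, 4.0]
t2 = [5.0, 6.0, 7.0, 8.0, 9.0]
y1 = [1.0 + 2.0 * t + e for t, e in zip(t1, noise)]
y2_same = [1.0 + 2.0 * t + e for t, e in zip(t2, noise)]    # same line
y2_shift = [6.0 + 2.0 * t + e for t, e in zip(t2, noise)]   # new intercept

f_same = two_line_f(t1, y1, t2, y2_same)
f_shift = two_line_f(t1, y1, t2, y2_shift)
print(f_same, f_shift)
```

Segments drawn from one line give an F value near zero, while an intercept
shift between the segments gives a very large value.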

The  above  procedure  outlines  a  method  for 
determining  whether  a  set  of  measurements  coming  from  two 
different clusters is internally consistent or not. This is
done for each of the three measurements available, i.e.,
frequency, bearing sine, and bearing cosine. A weighted
decision scheme is then used based on the three calculated
F-values ($F_f$, $F_{BS}$, and $F_{BC}$) to determine whether the two
clusters contain consistent information or not. Considerations
used to establish this scheme include:

1)  Frequency  information  should  be  weighted 
less  than  bearing  sine  and  bearing  cosine 
information.  This  is  done  because  frequency 
is  only  measured  to  the  nearest  tenth  of  a 
Hertz,  so  a  normal  Doppler  shift  of  only  0.1 
Hz  can  indicate  a  spurious  inconsistency, 

2)  The  bearing  sine  (bearing  cosine) 
measurements  from  two  targets  may  be  quite 
close  while  the  bearing  cosine  (bearing 
sine)  measurements  are  substantially 
different.  Thus,  only  one  of  the  bearing 
component  measurements  is  needed  to  indicate 
inconsistency,  and 

3)  When  in  doubt,  pass  a  cluster.  The  result 

of  passing  a  cluster  that  should  not  be 
passed  is  an  increase  in  processing  time  for 
other  areas  of  the  MTTA  algorithms.  The 

result  of  not  passing  a  cluster  that  should 
be  passed  is  the  possible  loss  of  a  target 
track. 

The weighting scheme (to be discussed below)
depends on a method developed by Suich and Derringer
(Reference 13) and extended by Ellerton (Reference 14) to examine
the significance of subsets of the regression variables.
Typically, this subset of variables describes some particular
characteristic, such as trend or curvature, of the response
system. In our case, the characteristic being examined is the
tendency of the data to be best described by two regression
lines instead of one.

Let $\hat{Y}_f(\mathbf{x})$ represent the full regression model
and let $\hat{Y}_r(\mathbf{x})$ represent the reduced regression model, that
is, the full model minus the variable subset under
consideration. The quantity to be tested, $\gamma$, is the ratio of
the average squared expected difference in prediction of the
two models to the average variance of these predicted
differences:

$$\gamma = \frac{\text{average expected squared prediction difference}}{\text{average variance of predicted differences}}$$

The test statistic is the usual F-ratio for the variable
subset, and the hypothesis tested is

$$H_0: \gamma = \gamma_0 \quad \text{versus} \quad H_1: \gamma > \gamma_0.$$

Acceptance of the null hypothesis indicates that either the
full model should be reassessed as a predictor or that the
error variation in the data is simply too large relative to the
variation in predictive power of $\hat{Y}_f(\mathbf{x})$ and $\hat{Y}_r(\mathbf{x})$.
Essentially, if there are large errors relative to the range
covered by the data, then a subset of variables may, under the
usual F-test, appear to be significant when in fact they are
simply fitting the errors in the data. The purpose of this
test is to detect that situation.
In the extraction algorithm $\gamma_0$ was set equal to
2, and the critical value then became about four times the
F-value used to make the standard test:

$$H_0: B_2 = B_3 = 0 \quad \text{versus} \quad H_1: \text{at least one of } B_2, B_3 \neq 0.$$

3.5.2 Weighting Scheme

Using the regression analysis just described, the
internal consistency test reduces to the following. Letting
$F_u$ stand for the "usual" regression F critical value and $F_E$
stand for the value generated by applying Ellerton's work, the
weighting scheme used is:

1) If $F_f < F_u$ then $W_f = 0$;
   if $F_f \geq F_u$ then $W_f = 1$,

2) If $F_{BS} < F_u$ then $W_{BS} = 0$;
   if $F_u \leq F_{BS} < F_E$ then $W_{BS} = 1$;
   if $F_{BS} \geq F_E$ then $W_{BS} = 2$,

3) If $F_{BC} < F_u$ then $W_{BC} = 0$;
   if $F_u \leq F_{BC} < F_E$ then $W_{BC} = 1$;
   if $F_{BC} \geq F_E$ then $W_{BC} = 2$,

4) $W = W_f + W_{BS} + W_{BC}$, and

5) If $W \geq 2$ then reject the hypothesis that the two
clusters are consistent.
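The scheme can be written out as a short function. This is a sketch with
hypothetical names (`consistency_weight`, `clusters_consistent`) and
placeholder critical values. It rejects at a total weight of 2 or more, which
is the threshold implied by the explanation of the scheme: frequency together
with one bearing component, one very strong bearing component alone, or two
moderate bearing components.

```python
def bearing_weight(f_value, f_u, f_e):
    """Weight for a bearing sine or bearing cosine F-value (0, 1, or 2)."""
    if f_value < f_u:
        return 0
    if f_value < f_e:
        return 1
    return 2


def consistency_weight(f_freq, f_bs, f_bc, f_u, f_e):
    """Total weight W = W_f + W_BS + W_BC."""
    w_f = 1 if f_freq >= f_u else 0      # frequency can contribute at most 1
    return w_f + bearing_weight(f_bs, f_u, f_e) + bearing_weight(f_bc, f_u, f_e)


def clusters_consistent(f_freq, f_bs, f_bc, f_u, f_e):
    """True if the two clusters pass the internal consistency test."""
    return consistency_weight(f_freq, f_bs, f_bc, f_u, f_e) < 2


# Placeholder critical values, for illustration only.
F_U, F_E = 6.0, 24.0

print(clusters_consistent(50.0, 1.0, 1.0, F_U, F_E))   # frequency alone
print(clusters_consistent(50.0, 8.0, 1.0, F_U, F_E))   # frequency + sine
print(clusters_consistent(1.0, 30.0, 1.0, F_U, F_E))   # very strong sine
print(clusters_consistent(1.0, 8.0, 9.0, F_U, F_E))    # two moderate bearings
```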
Basically, this scheme says that frequency variation alone is
not enough to reject the idea of consistency; instead it must
reject in concert with either bearing sine or cosine. On the
other hand, bearing sine or cosine measurements are enough by
themselves to reject consistency if the indication of multiple
cluster data is extremely strong. Middle of the road rejection
values from both bearing cosine and sine measurements are also
enough to reject consistency. Because a false rejection of the
consistency hypothesis is much more costly than a false
acceptance, all critical values were set with $\alpha = .005$.

If a particular cluster possesses exactly two
isolated subclusters, then the above procedure is fine as it
stands: the cluster will be accepted or rejected based on the
final value of W. However, if a given isolated cluster
contains more than two isolated clusters as subsets, then
another procedure for rejecting consistency is needed. Suppose
the cluster in question has k isolated subclusters; then there
are $\binom{k}{2}$ cluster pairs to be examined for consistency. If it is
assumed that all clusters come from the same track, then k-1 of
the potential pairs should link together or show consistency.
Thus, for clusters with k isolated subclusters (k > 2), the
consistency hypothesis will be rejected if fewer than k-1
cluster pairs pass the two cluster consistency test.
Experience has shown this to be a fairly liberal criterion.
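The pair-counting rule can be sketched as below. The function name and the
`pair_test` callback are hypothetical: `pair_test(a, b)` stands in for the
two-cluster regression test and returns True when the pair is consistent.

```python
from itertools import combinations


def cluster_is_consistent(subclusters, pair_test):
    """Accept if at least k-1 of the k-choose-2 subcluster pairs pass."""
    k = len(subclusters)
    if k < 2:
        return True                      # nothing to compare
    passing = sum(1 for a, b in combinations(subclusters, 2)
                  if pair_test(a, b))
    return passing >= k - 1


# Toy stand-in for the regression test: labels sharing a first letter
# are treated as segments of the same track.
def same_track(a, b):
    return a[0] == b[0]


print(cluster_is_consistent(["A1", "A2", "A3"], same_track))
print(cluster_is_consistent(["A1", "A2", "B1"], same_track))
```

For k = 2 the rule reduces to the single two-cluster test.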

3.6 Automatic Cluster Extraction Summary

2) Polarization of observations in a few
frequency or bearing bins, and

3)  Internal  consistency  among  isolated 
subclusters . 

Cluster  isolation  is  measured  using  Ling's  survival  function. 
Those  clusters  possessing  a  survival  function  value  of  less 
than  10’^  are  considered  well  isolated  and  selected  for 
further  testing.  Once  a  cluster  is  considered  isolated  it  is 
subjected  to  the  Alam  and  Mitra  polarization  test  to  determine 
if  most  of  its  members  are  concentrated  in  a  few  bins  or  not. 
Those  clusters  whose  observations  are  concentrated  in  a  few 
bins  are  determined  to  be  nonrandom  collections  and  are  passed 
on  to  the  final  test.  Clusters  remaining  at  this  point  are 
examined  for  subclusters  which  happen  to  be  isolated  clusters 
also.  Those  clusters  which  contain  no  isolated  subclusters  are 
considered  internally  consistent  and  declared  target  data 
sets.  Clusters  containing  isolated  subclusters  are  given  a 
regression  test  for  internal  consistency.  Those  that  pass  the 
test  are  passed  as  target  data  sets. 

All  the  above  tests  are  conducted  using 
hypothesis  tests  with  alpha  values  set  high  to  allow  marginal 
groups  to  pass  through  and  be  selected.  It  is  felt  that  it  is 
better  to  pick  a  bad  cluster  as  a  potential  target  data  set 
than  reject  a  cluster  that  is  a  target  data  set. 
4.0 INITIAL GUESS PROCEDURE

Once  the  sonobuoy  data  have  passed  through  the 
cluster  formation  and  cluster  extraction  algorithms,  they  have 
been  grouped  into  sets  of  clusters  for  each  sensor  that 
correspond  to  potential  target  data  sets.  The  next  step  is  to 
solve  the  intersensor  correlation  problem  which  seeks  to 
determine  which  data  clusters  from  each  sensor  contain 
observations  on  the  same  target.  For  this  study,  simultaneous 
observations  for  a  target  were  required  from  either  two  or 
three  sensors  because,  in  general,  a  single  sonobuoy  cannot 
provide  enough  information  with  which  to  track  a  target.  For 
even  a  small  number  of  sensors  and  targets,  the  number  of 
potential  combinations  can  become  fairly  large.  For  example, 
consider  a  3  target  scenario  observed  by  three  sonobuoys  and 
assume  that  each  buoy  hears  each  target  perfectly  and  that  the 
cluster  formation  and  extraction  routines  perfectly  separate 
the  data.  There  are  then  three  data  clusters  associated  with 
each  sensor,  and  these  can  be  combined  into  27  (i.e. ,  3*3  +  3*3 
+  3*3)  potential  two-cluster  intersensor  combinations  and  27 
(3*3*3)  potential  three-cluster  intersensor  combinations. 
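The two counts can be checked by direct enumeration. The sketch below uses
hypothetical cluster labels and a hypothetical function name; it lists every
way of choosing a group of sensors and then one cluster from each chosen
sensor.

```python
from itertools import combinations, product


def intersensor_combinations(clusters_per_sensor, group_size):
    """All ways to pick `group_size` sensors and one cluster from each."""
    combos = []
    sensor_ids = range(len(clusters_per_sensor))
    for sensors in combinations(sensor_ids, group_size):
        for picks in product(*(clusters_per_sensor[s] for s in sensors)):
            combos.append(tuple(zip(sensors, picks)))
    return combos


# Three sensors, each holding three perfectly separated target clusters.
clusters = [["a1", "a2", "a3"], ["b1", "b2", "b3"], ["c1", "c2", "c3"]]

two_cluster = intersensor_combinations(clusters, 2)
three_cluster = intersensor_combinations(clusters, 3)
print(len(two_cluster), len(three_cluster))
```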

To  help  separate  unlikely  combinations  from 
likely  combinations,  an  initial  guess  procedure  and  a  set  of 
guess  evaluation  criteria  were  developed.  This  section 
describes  the  initial  guess  procedure  itself  and  the  following 
section  discusses  the  evaluation  criteria. 

4.1 Crossed-Bearing, Crossed-Frequency Initial Guess Procedure

An initial guess procedure which used
crossed-bearing and crossed-frequency information to generate
initial conditions for target tracks was described in last
year's report (Reference 3). It was used to generate a
reasonable guess of a target's position and velocity. This
guess was then passed to the initializers of either Tracor's
EKF or HTA so that the tracker could more quickly converge onto
an acceptable set of initial conditions for the target. This
procedure used the frequency and bearing measurements from two
or more sensors in a least squares formulation to estimate the
target's initial position and velocity. The estimates from
this procedure were found to be reasonably good and greatly
improved the speed and the initialization characteristics of
Tracor's two trackers.

For a bearing estimate from sensor i, the
crossed-bearing equation is:

$$x \sin\beta_i - y \cos\beta_i = x_i \sin\beta_i - y_i \cos\beta_i$$

where

$\beta_i$ = the bearing estimate for sensor i,

$(x_i, y_i)$ = the position components for sensor i,

$(x, y)$ = the position components for the target.

Similar crossed-bearing equations can also be obtained for any
other observing sensors. The crossed-frequency formulation
uses the bearing and frequency estimates from a given sensor to
generate target velocity estimates. The actual
crossed-frequency equation for buoy i is:

$$\dot{x} \cos\beta_i + \dot{y} \sin\beta_i = c\left[\frac{f_0}{f_i}\left(1 + \frac{\dot{x}_i \cos\beta_i + \dot{y}_i \sin\beta_i}{c}\right) - 1\right]$$

where

$\beta_i$ = Bearing estimate from buoy i,

$f_i$ = Doppler-shifted frequency estimate
from buoy i,

$f_0$ = Unshifted center frequency
transmitted by the target,

$c$ = Speed of sound in the water,

$(\dot{x}, \dot{y})$ = Velocity components for the target,

$(\dot{x}_i, \dot{y}_i)$ = Velocity components for the buoy.

Once  again,  similar  crossed-frequency  measurement  equations  can 
be  generated  for  other  sensors  that  observe  the  target. 

A  motion  model  is  needed  to  correlate  the  target's 
position  and  velocity  estimates  and  to  provide  a  means  of 
mapping  all  measurements  back  to  an  initial  epoch  so  that  a 
batch-type,  least  squares  estimate  can  be  generated  for  the 
target's  initial  position  and  velocity.  Since  only  a  small 
data  stream  which  covers  a  short  time  span  is  used  in  the 
initial  guess  procedure,  a  linear  motion  model  was  selected  to 
describe  the  target's  trajectory.  The  linear  motion  model 
used is:

$$x = x_0 + \dot{x}_0 \, \Delta t$$

$$y = y_0 + \dot{y}_0 \, \Delta t.$$

Following Bard (Reference 15), the crossed-bearing and
crossed-frequency equations can be combined with the motion
model in a two-equation least squares model to estimate
$(x_0, y_0, \dot{x}_0, \dot{y}_0)$.
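The batch estimate can be sketched in a few lines. This is an unweighted
illustration, not the report's program. It assumes the crossed-bearing row
x sin(b) - y cos(b) = x_i sin(b) - y_i cos(b), the crossed-frequency row
vx cos(b) + vy sin(b) = c[(f0/f)(1 + (vx_i cos(b) + vy_i sin(b))/c) - 1],
and the linear motion model. The function names, the dictionary keys, and the
synthetic scenario (units of metres and seconds) are all hypothetical.

```python
import math


def solve_linear(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("singular normal equations")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - s) / m[r][r]
    return x


def initial_guess(observations, f0, c):
    """Least squares estimate of [x0, y0, vx0, vy0] at the initial epoch."""
    rows, rhs = [], []
    for ob in observations:
        s, co = math.sin(ob["beta"]), math.cos(ob["beta"])
        xi, yi = ob["buoy_pos"]
        vxi, vyi = ob["buoy_vel"]
        dt = ob["dt"]
        rows.append([s, -co, dt * s, -dt * co])      # crossed-bearing row
        rhs.append(xi * s - yi * co)
        rows.append([0.0, 0.0, co, s])               # crossed-frequency row
        rhs.append(c * ((f0 / ob["f"]) * (1.0 + (vxi * co + vyi * s) / c) - 1.0))
    ata = [[sum(r[i] * r[j] for r in rows) for j in range(4)] for i in range(4)]
    atb = [sum(r[i] * y for r, y in zip(rows, rhs)) for i in range(4)]
    return solve_linear(ata, atb)


# Noise-free synthetic check with two stationary buoys.
C, F0 = 1500.0, 150.0
truth = (1000.0, 2000.0, 3.0, -2.0)
obs = []
for bx, by in [(0.0, 0.0), (5000.0, 0.0)]:
    for dt in (0.0, 60.0, 120.0):
        px, py = truth[0] + truth[2] * dt, truth[1] + truth[3] * dt
        beta = math.atan2(py - by, px - bx)
        los_speed = truth[2] * math.cos(beta) + truth[3] * math.sin(beta)
        obs.append({"dt": dt, "beta": beta, "f": F0 * C / (C + los_speed),
                    "buoy_pos": (bx, by), "buoy_vel": (0.0, 0.0)})

est = initial_guess(obs, F0, C)
print(est)
```

With noise-free measurements the estimate reproduces the true state; with
real data each row would be scaled by its measurement weight.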
4.2 Measurement Weighting Equations

Last year, least squares estimates for the
target's initial state vector were generated by weighting all
the measurements equally. This resulted in good position
estimates but relatively poor velocity estimates. It was felt
that weighting the measurements might improve the target's
velocity estimates.

Associated with each frequency and bearing
measurement is a standard deviation value which describes
statistically the accuracy of the measurement. These standard
deviations are computed as functions of the estimated
signal-to-noise ratio for the received signal. Typically, the
measurement standard deviations are used to compute a
measurement weight which enables the data to be processed by a
weighted least-squares tracking algorithm. With the
crossed-bearing and crossed-frequency initial guess algorithm,
the measurement equations are functions of the frequency and
bearing estimates, so some method has to be used which computes
standard deviations of the crossed-bearing and the
crossed-frequency measurements as functions of the standard
deviations computed for the frequency and bearing estimates.
Following Young (Reference 16), a first order Taylor series
approximation is used to compute the standard deviation for the
modified measurement equation. If we use $G$ to represent either
the crossed-bearing or the crossed-frequency measurement
function and $f$ and $\beta$ to represent the frequency and bearing
estimates, respectively, the following equation is used to
compute the measurement weights:

$$\sigma_G^2 = \left(\frac{\partial G}{\partial f}\right)^2 \sigma_f^2 + \left(\frac{\partial G}{\partial \beta}\right)^2 \sigma_\beta^2$$

In the above formulation, the assumption has been made that the
buoy positions and their velocities are perfectly known, so no
buoy position or velocity errors are allowed to propagate
through this standard deviation equation. Specifically for the
bearing equation, the measurement equation, $G$, is:

$$G = x_i \sin\beta_i - y_i \cos\beta_i.$$

Since no buoy position errors are encountered and since the
frequency estimate does not contribute to this measurement
equation, the crossed-bearing standard deviation becomes a
function of the bearing estimate only:

$$\sigma_G^2 = \left(x_i \cos\beta_i + y_i \sin\beta_i\right)^2 \sigma_{\beta_i}^2$$

Conversely, the crossed-frequency measurement equation is a
function of both the bearing and frequency estimates:

$$G = c\left[\frac{f_0}{f_i}\left(1 + \frac{\dot{x}_i \cos\beta_i + \dot{y}_i \sin\beta_i}{c}\right) - 1\right]$$


To obtain the crossed-frequency standard deviation, it is
assumed that no errors are encountered in obtaining buoy state
values so that they do not contribute to the error sources for
the crossed-frequency measurements:

$c$ = speed of sound in the water,

$(\dot{x}_i, \dot{y}_i)$ = velocity components of buoy i,

$f_0$ = unshifted center frequency value for the
transmitted target signal,

$f_i$ = Doppler shifted frequency received by the
sonobuoy.

The crossed-frequency equation for sigma then reduces to:


Using the weights generated by these equations in the
estimation procedure has greatly improved the velocity
estimates without affecting the position estimates from the
MTTA's IG algorithm. The IG algorithm now in use in the MTTA,
therefore, is a weighted least-squares crossed-bearing,
crossed-frequency algorithm.
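A sketch of how the crossed-frequency variance follows from the first order
Taylor approximation is given below. It is a derivation under assumptions,
not a transcription of the report's equation: the measurement function $G$ is
taken in the Doppler form shown in the first line, buoy velocities are
treated as exactly known, and the frequency and bearing errors are assumed
uncorrelated.

```latex
% Assumed crossed-frequency measurement function for buoy i
G(f_i, \beta_i) = c\left[\frac{f_0}{f_i}
    \left(1 + \frac{\dot{x}_i\cos\beta_i + \dot{y}_i\sin\beta_i}{c}\right) - 1\right]

% Partial derivatives with respect to the two measured quantities
\frac{\partial G}{\partial f_i} =
    -\frac{f_0}{f_i^{2}}\left(c + \dot{x}_i\cos\beta_i + \dot{y}_i\sin\beta_i\right),
\qquad
\frac{\partial G}{\partial \beta_i} =
    \frac{f_0}{f_i}\left(\dot{y}_i\cos\beta_i - \dot{x}_i\sin\beta_i\right)

% First order propagation of the measurement variances
\sigma_G^{2} \approx
    \left(\frac{f_0}{f_i^{2}}\right)^{2}
    \left(c + \dot{x}_i\cos\beta_i + \dot{y}_i\sin\beta_i\right)^{2}\sigma_{f_i}^{2}
  + \left(\frac{f_0}{f_i}\right)^{2}
    \left(\dot{y}_i\cos\beta_i - \dot{x}_i\sin\beta_i\right)^{2}\sigma_{\beta_i}^{2}
```

Under these assumptions the bearing term vanishes for a stationary buoy, so
the crossed-frequency weight is then set by the frequency error alone.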
5.0 PHYSICAL AND STATISTICAL CONSTRAINTS ON THE INITIAL GUESS ESTIMATES

After  initial  guess  estimates  are  generated  for 
each  of  the  possible  intersensor  cluster  combinations,  these 
estimates  must  be  examined  to  determine  which  ones  provide 
initial  conditions  which  are  plausible  in  real  world 
encounters. As stated previously, there are twenty-seven
possible two-cluster intersensor combinations and
twenty-seven three-cluster intersensor combinations for the
three-target, three-sensor scenario shown in Figure 7.1.
Figure 5.1 shows all the possible bearing line-of-sight
intersections that would result from combining bearings from
two sensors in the crossed-bearings procedure. Notice the
large number of intersections that result from this simple
example. Although it is not as easy to illustrate, a large
number of possible intersections also result from using the
crossed-bearing procedure for all possible three-cluster
intersensor combinations. This section examines ways to use
physical  and  statistical  constraints  to  eliminate  the  unlikely 
combinations  produced  by  the  initial  guess  procedure. 

5.1  Physical  Constraints  Test 


The  first  step  in  eliminating  implausible  initial 
guess  values  is  to  examine  the  estimates  in  terms  of  sonobuoy 
detection  limits  and  maximum  submerged  target  performance 
levels.  These  limits  are  referred  to  as  the  physical 
constraints  on  the  problem.  These  constraints  have  been  left 
as  user  defined  inputs  so  that  they  may  be  varied  for  the 
different  classes  of  sonobuoys  or  the  different  types  of 
targets  that  might  be  encountered  in  real  engagements. 
The first physical constraint fixes a maximum
detection range for the sonobuoys to eliminate candidate
initial guess estimates that fall well outside the detection
range of the sonobuoy pattern. In real applications, an
operator uses parameters such as the class of sonobuoys used,
the minimum detectable signal level (MDL) for these sonobuoys,
ambient conditions such as sea state, and other factors to set
a maximum range for the day for the sonobuoys. For our
simulation, we used a transmitted signal-to-noise ratio of 80
dB in a 1 Hz band at a distance of 1 yard from the source to
simulate each of the target's signal sources. Furthermore, the
MDL used for our simulation was set to a reasonable value of 0
dB in a 1 Hz band (this gives +10 dB in a 0.1 Hz band). Given
this 80 dB source level, a 0 dB MDL, and assuming a 20 log R
signal propagation loss through the water, the maximum
detectable range for each of the sonobuoys would be 10,000
yds. To ensure a sizable safety margin, a 50% error tolerance
was added to this 10,000 yd range, so the maximum allowable
range was set at 15,000 yds. Any initial guess estimate whose
position values would cause the target to fall outside this
15,000 yd range for any of the observing sensors was then
rejected and was never considered again in the MTTA.

The  second  physical  constraint  restricted  the 
target's  speed  to  some  maximum  value  which  once  again  was  set 
by  the  program  user.  For  actual  applications,  operators 
usually  know  what  the  maximum  submerged  speeds  are  for  certain 
classes  of  targets.  If  not,  the  operator  could  set  the  upper 
limit  to  the  maximum  known  value  for  any  class  of  targets.  For 
this  simulation,  the  maximum  target  speed  was  set  to  10  m/sec, 
which  corresponds  to  roughly  20  knots.  Once  again,  a  50%  error 
tolerance  was  added  to  this  value  so  that  the  maximum  allowed 
speed used by the program was 15 m/sec. This limit was
much larger than the speed of the fastest target in the
scenario, which was 9 m/sec, so it was considered
to be reasonable for our experiments.
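The two physical constraints can be sketched as follows. The function names
and the sample estimates are hypothetical, and the sketch assumes that
positions and ranges share one unit (yards here) while speeds are in m/sec.
The range limit comes from solving 20 log R = source level - MDL for R.

```python
import math


def max_detection_range(source_level_db, mdl_db):
    """Range at which a 20 log R spreading loss reduces the source to the MDL."""
    return 10.0 ** ((source_level_db - mdl_db) / 20.0)


def passes_physical_constraints(position, velocity, buoy_positions,
                                max_range, max_speed):
    """Reject estimates that are too fast or too far from any observing buoy."""
    if math.hypot(velocity[0], velocity[1]) > max_speed:
        return False
    return all(math.hypot(position[0] - bx, position[1] - by) <= max_range
               for bx, by in buoy_positions)


TOLERANCE = 1.5                                             # 50% error tolerance
MAX_RANGE_YD = TOLERANCE * max_detection_range(80.0, 0.0)   # 15,000 yd
MAX_SPEED = TOLERANCE * 10.0                                # 15 m/sec

buoys = [(0.0, 0.0), (6000.0, 0.0)]                         # illustrative layout
print(MAX_RANGE_YD, MAX_SPEED)
print(passes_physical_constraints((3000.0, 4000.0), (6.0, -5.0),
                                  buoys, MAX_RANGE_YD, MAX_SPEED))
print(passes_physical_constraints((3000.0, 20000.0), (6.0, -5.0),
                                  buoys, MAX_RANGE_YD, MAX_SPEED))
```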

With  these  physical  constraints,  several  of  the 
initial  guess  estimates  could  be  rejected,  so  their 
corresponding  intersensor  combinations  could  also  be  eliminated 
from  further  consideration.  However,  many  more  combinations 
passed  these  two  tests  than  were  rejected,  so  many  of  the 
combinations  were  still  under  active  consideration  at  this 
point.  If  one  looks  at  Figure  5.1,  one  sees  that  most  of  the 
two-cluster  intersensor  combinations  fall  well  within  the 
15,000  yd  range  of  their  participating  sonobuoys .  While  the 
target  speed  constraint  helped  reject  a  few  of  the  remaining 
combinations,  most  combinations  were  still  left  intact  after 
this  test  also.  Since  so  many  combinations  were  left,  it  was 
decided  to  perform  some  statistical  tests  on  the  remaining 
cluster  combinations  to  examine  the  quality  of  the  initial 
guess  estimates,  and  to  then  eliminate  all  estimates  that  were 
found  to  fit  the  data  poorly  in  a  statistical  sense. 

5.2 Two Sensor Statistical Compatibility Test

After  a  particular  two  sensor  initial  guess 
estimate  has  passed  the  physical  constraints  test,  it  is  then 
subjected  to  a  statistical  test  that  measures  the  consistency 
of  the  information  contained  in  the  contributing  clusters. 
This  test  is  similar  to  the  test  used  to  determine  cluster 
consistency  and  is  based  on  the  use  of  indicator  variables. 

From  section  4.2,  it  will  be  recalled  that  the 
initial guess procedure uses a weighted least squares method to
estimate the target's initial state vector $(x_0, y_0, \dot{x}_0, \dot{y}_0)$.
In doing this, it fits a model of the form

$$M_i = B_1 Z_{i1} + B_2 Z_{i2} + B_3 Z_{i3} + B_4 Z_{i4} + \epsilon_i$$

where (using the notation of section 4.1):

$$M_i = \begin{cases}
x_1 \sin\beta_1 - y_1 \cos\beta_1 & \text{for sensor 1's bearing data} \\
x_2 \sin\beta_2 - y_2 \cos\beta_2 & \text{for sensor 2's bearing data} \\
c\left[\dfrac{f_0}{f_1}\left(1 + \dfrac{\dot{x}_1 \cos\beta_1 + \dot{y}_1 \sin\beta_1}{c}\right) - 1\right] & \text{for sensor 1's frequency data} \\
c\left[\dfrac{f_0}{f_2}\left(1 + \dfrac{\dot{x}_2 \cos\beta_2 + \dot{y}_2 \sin\beta_2}{c}\right) - 1\right] & \text{for sensor 2's frequency data}
\end{cases}$$

$$(B_1, B_2, B_3, B_4) = (x_0, y_0, \dot{x}_0, \dot{y}_0)$$

$$(Z_{i1}, Z_{i2}, Z_{i3}, Z_{i4}) = \begin{cases}
(\sin\beta_1, \; -\cos\beta_1, \; \Delta t \sin\beta_1, \; -\Delta t \cos\beta_1) & \text{for sensor 1's bearing data} \\
(\sin\beta_2, \; -\cos\beta_2, \; \Delta t \sin\beta_2, \; -\Delta t \cos\beta_2) & \text{for sensor 2's bearing data} \\
(0, \; 0, \; \cos\beta_1, \; \sin\beta_1) & \text{for sensor 1's frequency data} \\
(0, \; 0, \; \cos\beta_2, \; \sin\beta_2) & \text{for sensor 2's frequency data}
\end{cases}$$

$\epsilon_i$ = the measurement error associated with
the $i$th observation.

When using two sensor data to estimate the
initial state vector of a particular target, there are only two
possibilities:

1)  The two clusters contain observations which
generate estimates for an actual target, or

2)  The two clusters contain information which
leads to estimates for a spurious target.

The  problem  is  to  separate  spurious  target  estimates  from  real 
ones.  Figure  5.1  illustrates  the  potential  number  of  bearing 
intersections  for  even  a  simple  example.  Because  a  linear 
least  squares  procedure  is  used,  the  state  vector  estimated  by 
the  initial  guess  procedure  is  a  weighted  average  of  all  the 
measurements  from  both  clusters.  This  will  result  in  producing 
a  great  number  of  the  spurious  intersections  shown  in  Figure 
5.1.  The  question  of  separating  real  from  spurious  estimates 
can  then  be  viewed  as  one  of  determining  whether  the  two 
clusters  are  estimating  the  same  target  or  not. 


One way to do this is to use indicator variables,
that is, to formulate the model as above with an additional
term $Z_5$:

$$M_i = B_1 Z_{i1} + B_2 Z_{i2} + B_3 Z_{i3} + B_4 Z_{i4} + B_5 Z_{i5} + \epsilon_i$$

where

$$Z_{i5} = \begin{cases}
1 & \text{if the observation comes from cluster 1} \\
0 & \text{if the observation comes from cluster 2.}
\end{cases}$$

The response model associated with this is

$$E(M_i) = B_1 Z_1 + B_2 Z_2 + B_3 Z_3 + B_4 Z_4 + B_5$$

for observations from cluster 1, and

$$E(M_i) = B_1 Z_1 + B_2 Z_2 + B_3 Z_3 + B_4 Z_4$$

for observations from cluster 2.

If $B_5$ is significant, then there is a significant difference
in the mean response of the system to the data from the two
sensors. In essence, $B_5$ is a measure of the degree of
averaging required between the two data sets to generate an
initial state vector estimate. A large amount of averaging
implies two different targets are producing the estimate; a
small amount of averaging implies only one target produced the
estimate.


Thus, the two sensor consistency test reduces to
a standard linear model test of the significance of the
indicator variable $Z_5$. This corresponds to the hypothesis
test

$$H_0: B_5 = 0 \quad \text{versus} \quad H_1: B_5 \neq 0$$

with test statistic

$$F^* = \frac{SSR(Z_5 \mid Z_1, Z_2, Z_3, Z_4)}{1} \div \frac{SSE(Z_1, Z_2, Z_3, Z_4, Z_5)}{n_1 + n_2 - 5}$$

where $SSR(Z_5 \mid Z_1, Z_2, Z_3, Z_4)$ is the extra regression
sum of squares (equivalently, the reduction in error sum of
squares) obtained by including $Z_5$,

$SSE(Z_1, Z_2, Z_3, Z_4, Z_5)$ is the error sum of
squares for the full model,

$n_1$, $n_2$ = number of points in clusters 1 and 2, and
$F^*$ has an F distribution under $H_0$.

The decision rule is:

if $F^* \le F_c(1, n_1 + n_2 - 5)$ accept $H_0$
if $F^* > F_c(1, n_1 + n_2 - 5)$ accept $H_1$

where

$F_c(1, n_1 + n_2 - 5)$ is the desired critical
point of the F-distribution with 1 and
$n_1 + n_2 - 5$ degrees of freedom.

Again,  because  the  cost  of  a  type  I  error  is  much  higher  than 
that  of  a  type  II  error,  the  alpha  level  for  this  test  was  set 
at  .005. 
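
The indicator-variable test above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the report's implementation: the function name `indicator_f_statistic`, its arguments, and the optional `weights` vector are hypothetical, and the critical point $F_c$ is assumed to be looked up separately in an F table at the chosen alpha level.

```python
import numpy as np

def indicator_f_statistic(Z, M, in_cluster_1, weights=None):
    """Partial F statistic for adding the indicator column Z5 to the 4-column model.

    Z            : (n, 4) array of regressors Z1..Z4 (hypothetical input layout)
    M            : (n,) array of responses
    in_cluster_1 : (n,) booleans, True where the observation is from cluster 1
    weights      : optional positive weights for weighted least squares
    """
    Z = np.asarray(Z, dtype=float)
    M = np.asarray(M, dtype=float)
    z5 = np.asarray(in_cluster_1, dtype=float).reshape(-1, 1)
    w = np.ones(len(M)) if weights is None else np.asarray(weights, dtype=float)
    root_w = np.sqrt(w)

    def sse(design):
        # Weighted least squares by scaling each row with sqrt(weight).
        A = design * root_w[:, None]
        b = M * root_w
        coef, *_ = np.linalg.lstsq(A, b, rcond=None)
        resid = b - A @ coef
        return float(resid @ resid)

    sse_reduced = sse(Z)                      # model without Z5
    sse_full = sse(np.hstack([Z, z5]))        # model with Z5
    dof = len(M) - Z.shape[1] - 1             # n1 + n2 - 5 for four regressors
    ssr_extra = sse_reduced - sse_full        # SSR(Z5 | Z1, Z2, Z3, Z4)
    return ssr_extra / (sse_full / dof)
```

A caller would accept $H_0$ (the two clusters describe the same target) when the returned value does not exceed the tabulated $F_c(1, n_1 + n_2 - 5)$.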

5.3  Three Sensor Statistical Compatibility Test

Once all two-sensor initial guess estimates have
been examined, attention is turned to potential three-sensor
initial guess estimates. As with the two-sensor estimates, the
three-sensor state estimates are made up of weighted averages
of observations taken from all three clusters. Thus, spurious
target estimates which satisfy the physical constraints of the
problem often occur.

Figure  5.2  contains  a  flowchart  of  the  three 
sensor  statistical  test.  Note  that  for  a  particular 
three-sensor  combination  to  be  examined,  at  least  two  of  the 
associated  two-sensor  estimates  must  have  passed  the  two-sensor 
test.  Once  a  three-sensor  initial  guess  has  been  calculated,  a 
nonlinear  least  squares  hypothesis  test  developed  by  Gallant 
(Reference  7)  is  used  to  decide  whether  this  estimate  is 
statistically  equal  to  each  of  the  previously  computed 
two-sensor estimates. It is this test which forms the heart of
the three-sensor compatibility test. If at least two of the
associated  two-sensor  estimates  are  statistically  equal  to  the 
three-sensor  estimate,  then  the  three-sensor  estimate  is  passed 
on  for  further  consideration. 

The  intent  of  the  three-sensor  test  is  to  search 
for  consistency  among  the  already  computed  two-sensor  estimates 
and  the  current  three-sensor  estimate.  Requiring  agreement 
with  at  least  two  of  the  possible  two-sensor  estimates  prevents 
one  two-sensor  estimate  from  dominating  the  three-sensor 
estimate. 


The Gallant nonlinear regression test (Reference
7) considers the hypothesis

$$H_0: X_0 = X_0' \quad \text{versus} \quad H_1: X_0 \neq X_0'$$

at the $\alpha$-level of significance, where

$X_0'$ represents the initial guess state
vector estimate under consideration.

The data, $M_i$, are measurements assumed to be
responses to the input vector $z_i$ linked by the non-linear
regression model

$$M_i = f(z_i, X_0) + e_i.$$

Under the null hypothesis the quantities

$$M_i - f(z_i, X_0') = e_i$$

are called the measurement residuals and are assumed to be
independent, normal, random variables with zero mean. If each
measurement has an associated positive weight $W_i$, then the
following sum

$$S(X_0') = \sum_i W_i e_i^2$$

is the weighted residual sum of squares for $X_0'$.


Then, for the initial guess routine the
regression function is:

$$f(z_i, X_0) = Z \, [(x_0 + \dot{x}_0 \Delta t_i) \sin\beta_{is} - (y_0 + \dot{y}_0 \Delta t_i) \cos\beta_{is}] + (1 - Z) \, [\dot{x}_0 \cos\beta_{is} + \dot{y}_0 \sin\beta_{is}]$$

where

$X_0 = (x_0, y_0, \dot{x}_0, \dot{y}_0)$ = initial state vector of
target,
$\Delta t_i = t_i - t_0$,
$\sin\beta_{is}$ = sine of bearing estimate from sensor $s$,
$\cos\beta_{is}$ = cosine of bearing estimate from sensor $s$,
$Z = 1$ when measurement is crossed-bearing,
$Z = 0$ when measurement is crossed-frequency.

In Gallant's notation, the input vector $z_i$ for each
measurement is:

$$z_i = (Z_1, Z_2, Z_3, Z_4) = (Z, \Delta t_i, \sin\beta_{is}, \cos\beta_{is})$$

and the parameter vector is:

$$X_0 = (x_0, y_0, \dot{x}_0, \dot{y}_0).$$


The test statistic used is

$$T(X_0') = \frac{S(X_0')}{S(\hat{X}_0)}$$

where

$X_0'$ are values for the previously computed two sensor state
vector estimates, and
$\hat{X}_0$ is the three-sensor state vector estimate.

The critical point is given by

$$c^* = 1 + \frac{p \, F_\alpha(p, n-p)}{n-p}$$

where

$p$ = number of parameters estimated,
$n$ = number of data points,
$F_\alpha(p, n-p)$ = $\alpha$ percentage point of the
F-distribution with $p$ numerator and $n-p$ denominator
degrees of freedom.


Lastly, the decision rule is:

if $T(X_0') \le c^*$, accept $H_0$
if $T(X_0') > c^*$, accept $H_1$.

The test procedure then is:

1)  Get the three-sensor data and compute an
initial state vector $\hat{X}_0$ with the initial
guess procedure,

2)  Compute $S(\hat{X}_0)$,

3)  For each of the previously computed
two-sensor estimates, $X_0'$, associated with
the three sensors under examination, compute
$S(X_0')$ (using the three sensor data),

4)  Perform Gallant's test on each of the
two-sensor estimates, and

5)  If at least two of the two-sensor estimates pass,
then pass the three-sensor estimate as good,
otherwise fail it.

The test statistic $T$ is derived by finding the
likelihood ratio for the composite hypothesis

$$H_0: X_0 = X_0' \quad \text{versus} \quad H_1: X_0 \neq X_0'.$$

The exact distribution of $T$ is not known, but Gallant finds a
random variable which is asymptotic to $T$ and whose distribution
is known. It is from this asymptotic distribution that $c^*$ is
determined. Basically, the test examines the ratio of the
residual sum of squares for the two-sensor and three-sensor
estimates. By definition, the two-sensor estimates will always
produce larger values of $S(X)$ than the three-sensor estimate,
so Gallant's test provides a measure of what is too large.
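
The ratio test just described can be sketched as follows. This is a minimal illustration, assuming the regression function, weighted residual sum of squares, and critical point take the forms given in this section; the function names, the argument layout, and the `f_alpha` input (an F percentage point supplied by the caller from a table) are hypothetical and not taken from the report's program.

```python
import numpy as np

def regression_function(state, z, dt, sin_b, cos_b):
    """f(z_i, X0): crossed-bearing rows where z == 1, crossed-frequency rows where z == 0."""
    x0, y0, vx0, vy0 = state
    bearing = (x0 + vx0 * dt) * sin_b - (y0 + vy0 * dt) * cos_b
    frequency = vx0 * cos_b + vy0 * sin_b
    return z * bearing + (1.0 - z) * frequency

def weighted_rss(state, m, w, z, dt, sin_b, cos_b):
    """S(X): weighted residual sum of squares evaluated at the state X."""
    resid = m - regression_function(state, z, dt, sin_b, cos_b)
    return float(np.sum(w * resid ** 2))

def critical_point(p, n, f_alpha):
    """c* = 1 + p * F_alpha(p, n - p) / (n - p)."""
    return 1.0 + p * f_alpha / (n - p)

def gallant_test(two_sensor_state, three_sensor_state, f_alpha, data):
    """Return (T, accepted) for one two-sensor estimate.

    data is the tuple (m, w, z, dt, sin_b, cos_b) of three-sensor arrays.
    """
    t = weighted_rss(two_sensor_state, *data) / weighted_rss(three_sensor_state, *data)
    c_star = critical_point(len(three_sensor_state), len(data[0]), f_alpha)
    return t, t <= c_star
```

Under this sketch, a three-sensor estimate would be passed on when `gallant_test` accepts at least two of its associated two-sensor estimates.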
6.0  INTEGER PROGRAMMING AND SCENARIO SELECTION

Up to this point, the MTTA has been concerned
primarily with examining multi-sonobuoy data and organizing it
into sets of observations that form potential tracks. These
potential tracks must now be organized into likely track
scenarios. That is, given the current set of observations, the
task is to determine which set of potential tracks most likely
describes the actual events taking place. With even a small
number of potential tracks, this becomes a combinatorial problem
of fairly large dimensions. To sift through the various potential
scenarios to select the optimal set, Tracor has used a linear
optimization technique known as integer programming (IP).

The most general integer programming formulation
is:

$$\min \; c'x = \sum_{i=1}^{n} c_i x_i$$

such that

$$x \in S \subseteq Z^n$$

where

$c'$ is the (transposed) n-dimensional cost vector,

$S$ is the constraint set,

$Z^n$ is the set of all n-dimensional integer
vectors.

Typically,  integer  programming  problems  are  difficult  to  solve 
so  a  great  deal  of  research  has  concentrated  on  identifying  and 
finding  efficient  solution  techniques  for  certain  special 
classes  of  this  problem.  Fortunately,  the  scenario  selection 
problem  falls  into  a  well  known  problem  type  called  the  set 
partitioning  problem. 
6.1  Set Partitioning Formulation

Garfinkel and Nemhauser (Reference 17) give the
basic formulation of the set partitioning problem. Consider a
set $I = \{1, 2, \ldots, m\}$ and let $P$ be a set such that
$P = \{P_1, P_2, \ldots, P_n\}$ where $P_j \subseteq I$ for $j$ in
the index set $J = \{1, \ldots, n\}$. Then a subset $J^*$ of $J$
defines a partition of $I$ if

$$\bigcup_{j \in J^*} P_j = I$$

and for $j, k \in J^*$, $j \neq k$ implies

$$P_j \cap P_k = \emptyset.$$

Thus, a partition of $I$ consists of a set of disjoint subsets of
$I$ such that their union is equal to $I$.


Let a cost, $c_j$, be associated with each $j \in J$
so that the total cost associated with a given partition is

$$\sum_{j \in J^*} c_j.$$

The set partitioning problem is to then find the
partition of $I$ having the minimum cost. Written as an IP, this
problem is:

$$\min \sum_{j=1}^{n} c_j x_j$$

subject to

$$\sum_{j=1}^{n} a_{ij} x_j = 1, \quad i = 1, \ldots, m$$

$$x_j = 0, 1; \quad j = 1, \ldots, n$$

where

$$x_j = \begin{cases} 1 & \text{if } j \text{ is in the cover} \\ 0 & \text{otherwise} \end{cases}
\qquad
a_{ij} = \begin{cases} 1 & \text{if } i \in P_j \\ 0 & \text{otherwise.} \end{cases}$$

The optimal solution to this problem, if one exists, will yield
an x-vector of zeros and ones such that a 1 in the $k$th
element of $x$ implies $P_k$ is in the partition and a zero in the
$k$th position implies $P_k$ is not in the partition. This
problem can be rewritten in vector notation as

$$\min \; c'x \quad \text{subject to} \quad Ax = 1$$

where

$1$ is a column vector of 1's,

$c = (c_1, c_2, \ldots, c_n)$,

$x = (x_1, x_2, \ldots, x_n)$, $x_j$ as above,

$A = (a_{ij})_{m \times n}$, $a_{ij}$ as above.

Note that $A$ has $m$ rows and $n$ columns. The $k$th column of $A$
corresponds to $P_k$. Row $i$ of column $k$ has a 1 in it
if $i$ is in $P_k$ and is zero otherwise.
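
To make the formulation concrete, the sketch below solves a tiny set partitioning instance by exhaustive enumeration. It is an illustration only: brute force is a stand-in chosen for readability and is not the list based search procedure the report uses, and the function name `solve_set_partition` and the example subsets and costs are hypothetical.

```python
from itertools import combinations

def solve_set_partition(m, subsets, costs):
    """Exhaustively search for the minimum-cost partition of I = {1, ..., m}.

    subsets : list of sets P_1..P_n (each a subset of I)
    costs   : list of costs c_1..c_n
    Returns (best_cost, chosen_indices) using 0-based indices, or None
    when no collection of the given subsets partitions I.
    """
    universe = frozenset(range(1, m + 1))
    best = None
    n = len(subsets)
    for r in range(1, n + 1):
        for chosen in combinations(range(n), r):
            covered = [element for j in chosen for element in subsets[j]]
            # Ax = 1: every element of I is covered exactly once.
            if len(covered) == m and set(covered) == universe:
                cost = sum(costs[j] for j in chosen)
                if best is None or cost < best[0]:
                    best = (cost, chosen)
    return best

# Hypothetical instance with m = 4 and n = 6 candidate subsets.
example_subsets = [{1, 2}, {3, 4}, {1, 3}, {2, 4}, {1, 2, 3}, {4}]
example_costs = [3.0, 4.0, 2.0, 2.0, 5.0, 1.0]
example_solution = solve_set_partition(4, example_subsets, example_costs)
```

In this instance three partitions exist, with total costs 7, 4, and 6, so the search selects the third and fourth subsets. Enumeration grows exponentially with $n$, which is why specialized search procedures are used in practice.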
6.2  Applying Set Partitioning to Scenario Selection - Constraints

To see that the scenario selection problem can be
cast as a 0-1 set partitioning problem let:

$$K = \{K_1, K_2, \ldots, K_{n_1}, K_{n_1+1}, \ldots, K_{n_1+n_2}, \ldots, K_{n_1+n_2+\cdots+n_{n-1}+1}, \ldots, K_{n_1+n_2+\cdots+n_n}\}$$

where

$K_1, \ldots, K_{n_1}$ represent the $n_1$ clusters
associated with sensor 1,

$K_{n_1+1}, \ldots, K_{n_1+n_2}$ represent the $n_2$ clusters
associated with sensor 2,

$K_{n_1+n_2+\cdots+n_{n-1}+1}, \ldots, K_{n_1+n_2+\cdots+n_n}$
represent the $n_n$ clusters associated with sensor $n$.

Then a particular track scenario can be represented by

$$T^* = T_{j_1} \cup T_{j_2} \cup \cdots \cup T_{j_k}$$

with cost

$$\sum_j c_j x_j = c_{j_1} + c_{j_2} + \cdots + c_{j_k}$$

(the sum running over all potential tracks), where

$T_{j_i}$ represents a given track, i.e. a subset of $K$,

$x_j$ is a binary variable such that

$$x_j = \begin{cases} 1 & \text{if track } j \text{ is in the scenario} \\ 0 & \text{if track } j \text{ is not in the scenario,} \end{cases}$$

$c_j$ is the cost of track $j$.

Forming the complement set $T_{j_{k+1}}$ as

$$T_{j_{k+1}} = K - T^*$$

we can write $K$ as the union of the $T_{j_i}$'s for any track
scenario, i.e.

$$K = T_{j_1} \cup T_{j_2} \cup \cdots \cup T_{j_k} \cup T_{j_{k+1}}.$$

Lastly, if we allow a particular cluster to appear in one and
only one of the $T_{j_i}$'s for a particular scenario we have that

$$j_i \neq j_k \text{ implies } T_{j_i} \cap T_{j_k} = \emptyset.$$

Therefore, for any track scenario, the $T_{j_i}$'s form a
partition of $K$, the set of all clusters. If each cluster in
the complement set $T_{j_{k+1}}$ is considered to be a potential
track made up of a singleton cluster with zero cost, then the
problem may be written as:

$$\min \; c'x \quad \text{subject to} \quad Ax + Is = 1$$

where

$c = (c_1, c_2, \ldots, c_n)$ is the cost vector,

$x = (x_1, x_2, \ldots, x_n)$, with $x_j = 1$ if track $T_j$ is in
the scenario and $x_j = 0$ if it is not,

$A = (a_{ij})_{m \times n}$, with $a_{ij} = 1$ if cluster $K_i$ is in
track $T_j$ and $a_{ij} = 0$ if it is not,

$I$ = $m \times m$ identity matrix,

$s = (s_1, s_2, \ldots, s_m)$ is the vector of slack
variables having zero cost, with $s_i = 1$ if cluster $K_i$ is in
the complement set and $s_i = 0$ otherwise,

$1$ = column vector of all ones.
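
The role of the slack variables can be illustrated with a short sketch. It assumes a hypothetical example of six clusters (indexed from 0) and three candidate tracks; the function names `build_constraint_matrix` and `slack_for_scenario` are likewise hypothetical. For a chosen scenario $x$, the slack vector $s = 1 - Ax$ marks the "left over" clusters, and the scenario is feasible only if every entry of $s$ is 0 or 1.

```python
import numpy as np

def build_constraint_matrix(n_clusters, tracks):
    """Build A (clusters x tracks): A[i, j] = 1 if cluster i belongs to track j."""
    A = np.zeros((n_clusters, len(tracks)), dtype=int)
    for j, track in enumerate(tracks):
        for i in track:
            A[i, j] = 1
    return A

def slack_for_scenario(A, x):
    """Return s = 1 - Ax when x is feasible (each cluster used at most once), else None."""
    s = 1 - A @ np.asarray(x, dtype=int)
    # s_i = 1 means cluster i is in the complement set; a negative entry
    # means some cluster appears in more than one selected track.
    return s if np.all((s == 0) | (s == 1)) else None

# Hypothetical candidate tracks, each a set of cluster indices.
example_tracks = [{0, 2, 4}, {1, 3}, {0, 3, 5}]
example_A = build_constraint_matrix(6, example_tracks)
example_slack = slack_for_scenario(example_A, [1, 1, 0])
```

Selecting the first two tracks leaves only cluster 5 in the complement set, while selecting the first and third tracks is infeasible because cluster 0 would be used twice.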

6.3  Applying Set Partitioning to Scenario Selection - Cost Function

This section derives the cost function used in
the IP formulation of the scenario selection problem. As
explained in previous reports (References 2 and 3), Tracor's
HTA uses an extended Kalman filter to sequentially track a
given target. The filter is initialized by a batch,
least-squares algorithm that uses a constant acceleration
motion model for target trajectories. Two measurement types,
bearing and Doppler shifted frequency, are used by the
measurement model.


The measurement model of the Kalman filter
produces a residual stream as data are processed:

$$e_k = Y_k - \hat{Y}_k$$

where $Y_k$ = observed measurement value and
$\hat{Y}_k$ = predicted measurement value.

The $e_k$'s are assumed to be independent, normal random
variables with zero mean and variance $w_k^2$. Thus for a
particular track, the likelihood function for the measurement
model residual stream can be computed. For track $T_i$
possessing $n_i$ observations, this function becomes:

$$L_i = \frac{e^{-\frac{1}{2}(e_1/w_1)^2}}{\sqrt{2\pi}\, w_1} \cdot \frac{e^{-\frac{1}{2}(e_2/w_2)^2}}{\sqrt{2\pi}\, w_2} \cdots \frac{e^{-\frac{1}{2}(e_{n_i}/w_{n_i})^2}}{\sqrt{2\pi}\, w_{n_i}}
= \frac{\exp\left[-\frac{1}{2} \sum_{k=1}^{n_i} (e_k/w_k)^2\right]}{(2\pi)^{n_i/2} \, (w_1 w_2 \cdots w_{n_i})}$$

Let $S_i = (T_{i_1}, T_{i_2}, \ldots, T_{i_{n_i}})$ be a
track scenario, then the likelihood of $S_i$ is defined as:

$$L(S_i) = L_{i_1} \cdot L_{i_2} \cdots L_{i_{n_i}}$$

where $L_{i_k}$ is the likelihood function for the measurement
model residual stream corresponding to track $T_{i_k}$. If $x_k$
is a variable such that

$$x_k = \begin{cases} 1 & \text{if } T_k \text{ is in scenario } i \\ 0 & \text{otherwise,} \end{cases}$$

then the above can be written as

$$L(S_i) = L(S) = L_1^{x_1} \cdot L_2^{x_2} \cdots$$

where $x_{i_1} = 1$, $x_{i_2} = 1$, ..., and all other $x_k = 0$.

In general then,

$$L(S) = L_1^{x_1} \cdot L_2^{x_2} \cdots L_n^{x_n}$$

and a reasonable scenario selection candidate would be that set
of $x_i$'s which maximizes $L(S)$. This is equivalent to finding
that set of $x_i$'s which minimizes

$$-2 \ln L(S) = x_1 (-2 \ln L_1) + x_2 (-2 \ln L_2) + \cdots + x_n (-2 \ln L_n).$$

Thus, for the scenario selection problem,

$$c_i = -2 \ln L_i = n_i \ln 2\pi + \sum_{k=1}^{n_i} \ln w_k^2 + \sum_{k=1}^{n_i} (e_k/w_k)^2$$

becomes the cost associated with track $T_i$, and the optimum
scenario is the one that maximizes the scenario likelihood
function (equivalently, minimizes the total cost $c'x$).
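
The track cost can be evaluated directly from a residual stream, as in the sketch below. It assumes each residual $e_k$ comes with a known standard deviation $w_k$; the function names `track_cost` and `scenario_cost` are hypothetical, and the sketch does not reproduce how the HTA generates the residuals.

```python
import math

def track_cost(residuals, sigmas):
    """c = -2 ln L for independent zero-mean normal residuals.

    residuals : measurement residuals e_k for one track
    sigmas    : matching standard deviations w_k (assumed known)
    """
    n = len(residuals)
    return (n * math.log(2.0 * math.pi)
            + sum(math.log(w * w) for w in sigmas)
            + sum((e / w) ** 2 for e, w in zip(residuals, sigmas)))

def scenario_cost(track_costs, x):
    """Total cost c'x of a scenario, where x[j] = 1 if track j is selected."""
    return sum(c * xj for c, xj in zip(track_costs, x))
```

Because the costs add across tracks, minimizing `scenario_cost` over feasible 0-1 vectors is the same as maximizing the product of the selected track likelihoods.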
6.4  Integer Programming Summary

In summary, it is important to keep in mind the
following aspects of the MTTA integer programming formulation.
For the given scenario under consideration the members of the
component tracks, $T_{j_1}, T_{j_2}, \ldots, T_{j_k}$, are determined
by the initial guess routine; the members of the complement
set, $T_{j_{k+1}}$, are the clusters that are "left over." The
cost $c_j$ of a particular track $T_j$ is determined by applying
the HTA to the data contained in the clusters which comprise
$T_j$.


The  solution  algorithm  is  a  list  based  search 
procedure  described  in  Garfinkel  and  Nemhauser  (Reference  17) 
and,  while  there  are  potentially  faster  algorithms  available, 
this  formulation  was  felt  to  be  sufficient  at  the  present 
time.  Larger  problems  will  likely  force  consideration  of 

faster  solution  algorithms. 

Finally, it should be noted that Morefield
(Reference 8) has also used integer programming to solve the
intersensor correlation problem. He used a set packing
algorithm which, with the addition of slack variables, is
equivalent to the set partitioning algorithm. However, his
rationale for casting the problem in this framework is
substantially different, using a Bayesian approach to track
formation, and his cost function is slightly different. More
importantly, he uses data types which ignore the passive
initialization problem and he works with substantially smaller
data sets. Because of this, Morefield used neither clustering
techniques nor initial guess procedures to prune the data,
relying on the IP to handle these chores.
7.0  RESULTS

7.1  Introduction 

This  section  presents  the  results  of  applying 
Tracor's  MTTA  to  several  different  simulated  multi-target 
scenarios.  Topics  discussed  in  this  section  include: 

1)  Generation  of  simulated  multi-target  data, 

2)  A  three  target  scenario  with  good  geometry, 
strong  signal,  and  a  threshold  of  0  dB  in  a 
1  Hz  band, 

3)  A  three  target  scenario  with  poor  geometry, 
strong  signal,  and  a  threshold  of  0  dB  in  a 
1  Hz  band, 

4)  A  three  target  scenario  with  good  geometry, 
moderate  signal,  and  no  threshold,  and 

5)  A two target scenario with good geometry,
strong  signal,  and  a  threshold  of  0  dB  in  a 
1  Hz  band. 

The  section  corresponding  to  topic  (2)  contains  a  detailed 
discussion  that  illustrates  the  performance  of  the  MTTA. 
Figures  and  tables  are  presented  which  show  the  results  of  the 
data  clustering,  automatic  cluster  extraction,  initial  guess 
estimation,  and  integer  programming  modules. 

All scenarios generated were single trial runs;
thus the results presented are not Monte Carlo averages and
reflect more variation than do typical Monte Carlo results.
All scenarios were of six minute duration and employed three
sensors to gather data. Computing was performed on a UNIVAC
1100/61 where the run times per scenario ranged from seven to
nine minutes depending on the scenario complexity and the
received signal strength. This point should be noted: Tracor's
MTTA is currently running in near real time. With only a
minimal amount of parallel processing, i.e., separate
processors to cluster each buoy's data and two processors to
evaluate potential tracks for the IP module, the algorithm
could easily be made to run in real time. Because the MTTA
uses relatively little storage space (less than 64 K words for
the clustering algorithm and less than 30 K words for all other
modules) and requires only six to seven significant digits of
numerical accuracy, current parallel processing options would
be more than adequate for MTTA computations.

7.2  Scenario Data Generation


To study the tracking capabilities of Tracor's MTTA,
multi-target data were needed to test the tracker. Since no
real multiple target data sets were readily available,
simulated multiple target data were generated to fill this
need. Following is a brief discussion of the type of
sonobuoys modeled, the type of trajectories used by the
targets, and the target signal strength levels that were used
to simulate the multi-target data.

7.2.1  Sensor  Locations 


For all the geometries investigated, three
non-moving passive sensors were used to gather data from the
targets. These sensors were deployed in an equilateral,
triangular arrangement that is known as a tri-tac pattern. The
baseline distance between each of the sonobuoys was fixed at
7000 m. This arrangement was felt to be sufficient to ensure
that each of the targets could be observed by all the sensors.

7.2.2  Target  Trajectories 

Three  different  multi-target  geometries  were 
simulated  for  the  current  study.  For  each  geometry,  all  the 
targets  followed  unique,  constant  velocity,  constant  heading 
trajectories  for  six  minute  durations.  These  trajectories  were 
specifically  designed  to  prevent  track  intersections  during 
this  six  minute  time  span.  The  three  geometries  used  are 
summarized in Tables 7.I-7.III. These scenarios were used to
determine if certain geometric and dynamic combinations could
be formed that might prevent the MTTA from tracking all the
targets. As will be discussed, the scenarios described in
Tables 7.I and 7.III proved to be favorable observation
geometries whereas the scenario described in Table 7.II proved
to be an unfavorable multi-target geometry with the MTTA
failing to track one of the three targets.
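
Because every target holds a constant velocity and heading, its position at the end of the six minute run follows directly from its initial state. The sketch below illustrates this for the scenario 2 initial states listed in Table 7.II. It rests on one assumption not stated in the text: that course heading is measured counter-clockwise from the positive x axis. That convention reproduces the legible final-position entries of Table 7.II. The function name `final_position` is hypothetical.

```python
import math

def final_position(x0, y0, speed, heading_deg, duration_s=360.0):
    """Constant-velocity, constant-heading propagation over duration_s seconds.

    Assumes heading_deg is measured counter-clockwise from the +x axis.
    """
    heading = math.radians(heading_deg)
    distance = speed * duration_s
    return (x0 + distance * math.cos(heading),
            y0 + distance * math.sin(heading))

# Initial states (x0 m, y0 m, speed m/s, heading deg) from Table 7.II.
scenario_2_targets = {
    1: (-3000.0, -750.0, 6.0, 15.0),
    2: (2500.0, 500.0, 9.0, 90.0),
    3: (-500.0, 4500.0, 4.0, 300.0),
}
scenario_2_final = {k: final_position(*v) for k, v in scenario_2_targets.items()}
```

Under this assumption, target 3 ends near (220, 3253) m, in agreement with the final-track entries of the table.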

7.2.3  Simulated  DIFAR  Multi-Target  Data 


After  the  three  scenarios  described  above  were 
created,  simulated  sonobuoy  measurement  data  for  the  targets 
were  needed  so  that  the  MTTA  could  attempt  to  reconstruct  the 
tracks  for  all  the  targets.  To  generate  the  data  for 
individual  targets,  a  DIFAR  simulator  (References  2  and  3)  was 
used  to  generate  frequency  and  bearing  measurements .  This 
simulator  used  a  peak  picking  method  to  estimate  the  frequency 
from  a  normalized  frequency  spectrum  and  an  arctangent 
processor to estimate the bearing. For this simulation, a data
update interval of 10 s was used to make the measurements, and
since the resolution of the frequency estimates is inversely
proportional to the update interval, the frequency measurements
were limited to a 0.1 Hz resolution. As described before
(References 2 and 3), this data generation program is a fairly
realistic model which outputs non-Gaussian measurements for the
individual targets.

TABLE 7.II

DESCRIPTION OF THE 3 TARGET
TRACKS FOR SCENARIO 2

Buoy Positions (column headings illegible in source)

Buoy #    Entries as printed
1         -3500, [illegible], 0
2         0, 0
3         3500, 0

Initial Target Tracks (t0 = 0 s)

Target #    X0 (m)    Y0 (m)    V0 (m/s)    Course Heading (deg)
1           -3000     -750      6           15
2            2500      500      9           90
3            -500     4500      4           300

Final Target Tracks (tf = 360 s)

Target #    X0 (m)         Y0 (m)    V0 (m/s)    Course Heading (deg)
1           [illegible]    -191      6           15
2           [illegible]    3740      9           90
3           220            3253      4           300

To  form  the  multi-target  data  needed  for  the 
current  investigation,  individual  target  data  sets  were 
generated  for  each  of  the  targets  present  in  a  given  scenario. 
Then, the individual target data sets were merged for each of
the sonobuoys to produce what was referred to last year as
multiple linetracker data (Reference 3). Certain weaknesses
were  acknowledged  previously  in  simulating  the  data  in  this 
fashion,  but  overall,  this  type  of  multiple  target  data  was 
felt  to  be  more  than  adequate  for  testing  the  MTTA. 

For  all  but  one  case,  each  target  transmitted  a 
tone  whose  SNR  level  as  measured  1  yard  from  the  source  was 
simulated  as  80  dB  in  a  1  Hz  band.  For  these  cases,  the  DIFAR 
simulator  used  a  threshold  of  0  dB  in  a  1  Hz  band  that  had  to 
be  met  or  exceeded  before  any  measurement  estimates  were 
output.  If  one  uses  a  20  log  R  approximation  for  the 
propagation  loss,  where  R  is  the  magnitude  of  the  distance  from 
the  source  to  the  receiver,  an  80  dB  source  level  without  any 
fluctuation  could  be  heard  10,000  yards  away  and  just  meet  the 
0 dB threshold criterion. However, the DIFAR simulator
introduces some random noise terms to model random fluctuations
in  both  the  target  signal  source  level  and  the  ambient  noise 
level,  so  the  absolute  maximum  observation  range  for  the 
sonobuoys  cannot  be  set. 
One other case was simulated which used a lower
SNR value for the target source level, but eliminated the
thresholding criterion so that measurement estimates were made
for all time updates. Usually, threshold levels are set so
that most target measurements exceed the level but no, or
almost no, random noise measurements are introduced into the
data set. Unfortunately, such thresholding techniques can
eliminate actual measurements in order to ensure that no noise
is included in the data. For this particular case, the
threshold was dropped to see if the sensors could pick up a
weaker signal than that used in the other simulations. The
target source level was lowered by 3 dB to 77 dB in a 1 Hz
band. Under the 20 log R approximation, this 3 dB loss
decreased the range at which the signal could be heard by
approximately 30%. To compensate for this loss in range of the
detection system, the threshold was eliminated so that
measurement updates were always made. The clustering algorithms
were then used to pick the true measurements from the random
noise found in the data. The results from eliminating the
measurement threshold and then clustering the data to separate
true target data from random noise are presented in a
subsequent subsection.
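
The range figures quoted above follow from the 20 log R propagation loss approximation, as the short sketch below shows. It assumes, as in the text, that the source level is referenced to 1 yard and that fluctuations are ignored; the function name `max_detection_range` is hypothetical.

```python
def max_detection_range(source_level_db, threshold_db=0.0):
    """Range (in yards, the source-level reference unit) at which a
    20 log R propagation loss reduces the source level to the threshold."""
    return 10.0 ** ((source_level_db - threshold_db) / 20.0)

range_80_db = max_detection_range(80.0)   # strong-signal cases
range_77_db = max_detection_range(77.0)   # source level lowered by 3 dB
range_reduction = 1.0 - range_77_db / range_80_db
```

An 80 dB source just meets a 0 dB threshold at 10,000 yards, and lowering the source level by 3 dB scales the range by $10^{-3/20} \approx 0.71$, a reduction of roughly 30%.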

7.3  Three Target Scenario, Good Geometry, Strong Signal

7.3.1  Introduction 


This section contains a discussion of the results
obtained by applying the MTTA to a three target scenario with
good observation geometry in a low noise environment. It is
very likely that no multi-target algorithm will be able to
handle all possible observation geometries, so this report
contains examples of the MTTA's performance for both good and
bad observation geometries. Furthermore, real ocean ambient
noise levels and target source levels vary drastically from one
situation to the next, so detected SNR's vary greatly for
different encounters. This variation in detected SNR's
seriously affects the quality of the measurements which in turn
greatly affects tracking accuracy. By running a three target
scenario under various degrees of signal degradation, it was
hoped that a better idea of algorithm performance could be
obtained.


Figure  7.1  contains  a  plot  of  this  scenario.  The 
plus  signs  represent  the  buoys,  the  solid  lines  the  actual 
trajectories,  and  the  dashed  lines  MTTA's  estimated 
trajectories.  Detailed  results  of  the  MTTA's  solutions  from 
each  module  are  presented  next. 


7.3.2  Data Clustering Results - Buoys 1 and 3


Table 7.IV contains the frequency, bearing sine,
and bearing cosine measurements generated by buoy 1, while
Table 7.V contains the measurements generated by buoy 3 of the
scenario. The column headed 'TARGET' indicates whether the
measurement came from target 1, target 2, or target 3.

Tables 7.VI and 7.VII present the clusters
generated by Ling's algorithm (for buoys 1 and 3,
respectively), the associated cluster level or node number (see
section 3.2), and the object numbers for the members of each
cluster. Examination of these tables illustrates several
points:
TABLE 7.IV

SIMULATED MULTI-TARGET DIFAR DATA FOR
BUOY 1 OF SCENARIO 1

SAMPLE   TIME       FREQUENCY                                  TARGET
NO.      (s)        (Hz)         COS(β)     SIN(β)             NO.

 1        15.0000   150.000      .308??3    -.951275           1
 2        15.0000   150.100      .655623     .755089           3
 3        25.0000   149.900      .348633    -.93733?           1
 4        25.0000   150.100      .6?2862     .765994           3
 5        35.0000   149.900      .39????    -.920050           1
 6        35.0000   150.200      .97??26    -.239794           2
 7        35.0000   150.100      .778779     .627298           3
 8        45.0000   149.900      .37?085    -.92899?           1
 9        55.0000   149.900      .32887?    -.944374           1
10        55.0000   150.200      .996?33    -.643933-001       2
11        55.0000   150.100      .661244     .750171           3
12        65.0000   149.900      .447230    -.894?19           1
13        65.0000   150.100      .638200     .769871           3
14        75.0000   150.100      .55656?     .830792           3
15        85.0000   149.800      .560186    -.828367           1
16        85.0000   150.000      .999073    -.430587-001       2
17        85.0000   150.100      .741?66     .671?30           3
18        95.0000   149.800      .529?4?    -.6483?3           1
19        95.0000   150.000      .997932    -.6?2???-001       2
20        95.0000   150.100      .698291     .71581?           3
21       105.000    149.900      .9?90?6     .1?7611           2
22       115.000    149.800      .756?62    -.654037           1
23       115.000    149.900      .9?9826    -.186?46-001       2
24       115.000    150.100      .712585     .701586           3
25       125.000    149.800      .743048    -.669238           1
26       125.000    149.900      .975569     .21969?           2
27       125.000    150.100      .67262?     .73998?           3
28       135.000    149.800      .617?9?    -.78??78           1
29       135.000    149.900      .975051     .221983           2
30       135.000    150.100      .79901?     .601313           3
31       145.000    149.800      .682?99    -.730887           1
32       145.000    149.900      .993101     .117262           2
33       155.000    149.700      .67162?    -.7?0892           1

(? marks a character that is illegible in the source.)
TABLE 7.IV (CONCLUDED)

SAMPLE   TIME       FREQUENCY                                  TARGET
NO.      (s)        (Hz)         COS(β)     SIN(β)             NO.

34       155.000    149.900      .906245     .418439           2
35       165.000    149.700      .610771    -.791807           1
36       165.000    149.900      .999421     .340218-001       2
37       175.000    149.700      .731946    -.712233           1
38       175.000    149.900      .987672     .156538           2
39       175.000    150.100      .819604     .572931           3
40       185.000    149.700      .689152    -.724617           1
41       185.000    150.100      .777987     .628405           3
42       195.000    149.?00      .999923    -.126606-001       2
43       195.000    150.100      .806951     .587876           3
44       205.000    149.700      .745327    -.666699           1
45       205.000    149.?00      .973447     .228911           2
46       205.000    150.100      .797952     .602721           3
47       215.000    149.700      .761986    -.647593           1
48       215.000    149.?00      .963391     .268399           2
49       215.000    150.100      .79?40?     .612580           3
50       225.000    149.700      .777883    -.628409           1
51       235.000    149.700      .792367    -.610045           1
52       235.000    149.900      .935303     .353854           2
53       235.000    150.100      .759726     .650244           3
54       245.000    149.600      .617723    -.575616           1
55       245.000    150.000      .696261     .71776?           3
56       255.000    149.600      .845389    -.534151           1
57       255.000    150.000      .813719     .581259           3
58       265.000    149.600      .617620    -.575759           1
59       265.000    150.000      .775536     .651704           3
60       275.000    149.600      .86644?    -.495783           1
61       295.000    149.600      .863743    -.53393?           1
62       295.000    149.700      .917841     .396?47           2
63       295.000    150.000      .796430     .634730           3
64       305.000    149.600      .694648    -.466260           1
65       315.000    149.600      .87267?    -.488667           1
66       325.000    149.600      .84156?    -.543163           1
67       325.000    149.700      .94858?     .316532           2
68       325.000    150.000      .82189?     .569646           3
69       335.000    149.600      .894865    -.447937           1
70       335.000    149.700      .966681     .255982           2
71       335.000    150.000      .771498     .636231           3
72       345.000    149.600      .9?666?    -.417516           1
73       345.000    150.000      .911?35     .411444           3

(? marks a character that is illegible in the source.)

74 

355.300 

149.600 

.919773 

-.392450 

1 

75 

355.303 

150.300 

•812333 

•593199 

3 

90 


TABLE 7.V

SIMULATED MULTI-TARGET DIFAR DATA FOR
BUOY 3 OF SCENARIO 1

SAMPLE NO.


5 .ODOOO 

5 . ooooo 

15.0000 

15.0000 

25.0000 

25.0030 

25.0000 

35.0Q0C 

35.0000 

45.0000 

45.0000 

45.0000 

55.0300 

55.COOC 

65 .0000 

65.G00C 

75.0000 

75.0000 

85.0000 

85.0000 

95.0000 

1C5.300 

105.000 

115.030 

115.000 

125.300 
125. COO 

135.300 


145.300 

155.003 

155.030 

155.030 


FREQUENCY 

COS (3) 

SIS(B) 

150.600 

-.957011 

-.290051 

150.400 

-.881346 

-.472472 

150.600 

-.991340 

-.131323 

1 50 .300 

-.936695 

-.350145 

150. 600 

-•9964D9 

.846665- 

001 

150.200 

-.969837 

-.243756 

150.700 

“•790688 

•612219 

150.600 

-.988933 

-.148360 

150.100 

-.988085 

-.153907 

U3.60* 

-.938766 

-.344555 

150.100 

-.996927 

-.783859- 

001 

150.300 

-.707389 

.706824 

150.000 

-.999433 

-.336628- 

001 

150.300 

-.744818 

.667267 

ISO. OCC 

-.998722 

.505450- 

001 

150.300 

-.720066 

.693885 

149.90«3 

-.986110 

.153750 

150.300  • 

-.78758? 

.61620° 

150.600 

-.934025 

-.357209 

149.800 

-.969759 

.244063 

149.800 

-.959518 

.281646 

150.60C 

-.98513® 

-.171788 

149.600 

-.922599 

.385761 

149.600 

-.866289 

.499543 

150.300 

-.815628 

.578576 

150.600 

-.820882 

-.571098 

149.600 

-.84074? 

•541434 

150.600 

-.966779 

-.255615 

1*9. 500 

-.7«c70l 

.612719 

150.300 

-.669786 

.742015 

150.600 

-.896*14 

-.442408 

149.500 

-.777763 

.628533 

150.300 

-.731440 

.681906 
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TARGET 

NO. 


UMHUNlHNHUMNHMrOHWMUNUiroUNJHMHUlOHMHNH 


TABLE  7 . V  (CONTINUED) 


Mvir  LiLj 

NO. 

TIME 

FREQUENCY 

34 

165.300 

150.600 

3S 

165.000 

149.400 

36 

165.005 

150.300 

37 

175.000 

153.630 

38 

175.300 

149.400 

39 

175.300 

150.300 

40 

185.030 

153.600 

41 

185.000 

149.400 

42 

1*5.330 

153.309 

43 

195. 3J0 

149. 30C 

44 

195.030 

150.300 

45 

205.330 

149.300 

46 

205.300 

150.300 

47 

215.030 

153.6C0 

48 

2*5.300 

149.300 

49 

225.330 

149.300 

50 

225.330 

150.300 

51. 

23*5.030 

150.600 

52 

235.000 

149.300 

53 

235.333 

153.300 

54 

245.330 

15Q.60C 

55 

245.33 C 

150.30? 

56 

25S.300 

149.300 

57 

255.330 

150.300 

58 

265.333 

150.600 

59 

265.030 

15? .300 

60 

275.330 

149.300 

61 

265. 33C 

149.30? 

62 

285.330 

150.300 

63 

295. 00C 

149.300 

64 

295.330 

150.30? 

65 

335.030 

159. 60C 

66 

305.C30 

149.20? 

67 

3C5.300 

150.30? 

68 

315.330 

149.200 

69 

315.039 

150.300 

7G 

325.30C 

150.60? 

71 

325.333 

153. 300 

72 

335. 33C 

150.60? 

73 

335.330 

149.200 

74 

335.330 

159.300 

75 

345.300 

149.203 

76 

345.33'* 

150.300 

COS(g) 

'•969606 
•727551 
■.79726? 
-.915253 
••738751 
-.645463 
••961503 
■.63132? 
■•7167  3* 
■•631385 
••712*2* 
*.*>93445 
••847*68 
••959561 
-.693*35 
*.546741 
*.82396? 
*.976378 
*•533599 
*.79420* 
*•9*1613 
*.766645 
*•492151 
*.733882 
*.948240 
■•  776e  1C 
-.45373? 
*•404696 
••7*1*06 
*•469664 
-.750253 
*•  9*6*7? 
-•39019* 
-.761725 
-.389568 
-• 708P 16 
-.982243 
-.862*3° 
-•  966°6  ! 
-  •  37  96  S  2 
-.777935 
-.43589! 
-.996776 
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SIN(S) 

-.244673 
.686053 
.603609 
-  .402879 
.673978 
.763794 
-.274796 
.799008 
•695316 
.775470 
.70174* 
.797404 
.530207 
-.281502 
.797110 
.835993 
•566638 
-.216070 
.845737 
.709998 
-.190884 
.63967? 
.870515 
.679277 
*.317555 
.627136 
.885976 
.914451 
•623521 
.882845 
.661151 
.552OQ7-C01 
.V20731 
.6*7900 
.92099* 
.706196 
••187611 
.505479 
.25492? 
.925117 
.628344 
.900500 
.422473 


TARGET 

NO. 


TABLE  7 .  V 

(CONCLUDED) 

SAMPLE 

NO. 

TIME 

FREQUENCY 

COS ( B ) 

SIN(B) 

TARGET 

NO. 

77 

356*000 

150*600 

-.9*6756 

-.216167 

1 

76 

355.300 

140.203 

-.26575° 

.964039 

1 

79 

355.330 

150.300 

-.62114? 

.570723 

3 
Tracor Applied Sciences


1)  The  hierarchical  nature  of  the  data 
organization  imposed  by  the  algorithm; 

2)  The  varying  lengths  of  time  that  clusters 
remain  isolated  before  becoming  subsets  of 
other  larger  clusters;  and 

3)  All  clusters  are  eventually  merged  to  form  a 
cluster  containing  the  whole  set  of  data. 

Note that for buoy 3, cluster 75 contains exclusively the
points generated by target 1, cluster 76 does the same for
target 2, and cluster 55 contains all the points associated
with target 3. Similarly, for buoy 1, cluster 71 contains only
target 1 data, cluster 72 contains target 2 data, and cluster
11 contains all the data from target 3. Thus, the clustering
algorithm has clearly separated the target data; the MTTA
must now automatically identify those clusters which are
most likely to contain the desired individual target data.

Tables 7.VIII and 7.IX contain the isolation
indices and survival function values for all of the clusters
generated (see sections 3.2 and 3.3). Keeping only those
clusters whose survival function values are less than 10^-4,
it can be seen that not only do all the candidate clusters pass
the isolation test, but so do several others. Thus, additional
cluster extraction techniques are needed to pick out the
appropriate data. This is discussed in the next section.
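The survival-function screening described above can be sketched in a few lines. This is only an illustration, not the report's program. It assumes that the printout omits the "E" of scientific notation (so ".1671463-003" is read as 0.1671463e-3) and that the cut-off is 10^-4, a value inferred from the cluster lists in section 7.3.3. The helper names are hypothetical; the sample entries are taken from the buoy 1 values in Table 7.VIII.

```python
import re

# Survival-function entries as printed for a few buoy 1 clusters.
# Assumption: the printout drops the "E", so ".1671463-003" means 0.1671463e-3.
PRINTED = {
    56: ".7777778",
    57: ".1671463-003",
    58: ".4172325-006",
    64: ".4649013-003",
    65: ".1311105-006",
    67: ".8940697-007",
    70: ".3411472-003",
    71: ".1043081-006",
}

_FIELD = re.compile(r"^([+-]?\d*\.\d+)([+-]\d+)?$")

CUTOFF = 1.0e-4  # assumed cut-off, inferred from the lists in section 7.3.3


def parse_printed(field):
    """Convert a printed field such as '.1671463-003' to a float."""
    match = _FIELD.match(field.strip())
    if match is None:
        raise ValueError("unreadable field: %r" % field)
    mantissa, exponent = match.groups()
    return float(mantissa + "e" + (exponent or "+0"))


def passes_isolation_test(printed, cutoff=CUTOFF):
    """Return the cluster numbers whose survival value is below the cut-off."""
    return sorted(c for c, f in printed.items() if parse_printed(f) < cutoff)


survivors = passes_isolation_test(PRINTED)
print(survivors)
```

Clusters 57, 64, and 70 have survival values of a few times 10^-4 and are screened out, while 58, 65, 67, and 71 are kept, which agrees with the buoy 1 list quoted in section 7.3.3.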


TABLE 7.VIII

SURVIVAL FUNCTION AND ISOLATION INDEX VALUES FOR
CLUSTERS FROM BUOY 1 OF SCENARIO 1

(? = digit illegible in the source)

CLUSTER      SURVIVAL        ISOLATION
  NO.        FUNCTION          INDEX

   1       .1830669-001          73
   2       .8501675               3
   3       .1487049              35
   4       .2180120              28
   5       .3200111              14
   6       .9302090-002          85
   7       .6555628-001          45
   8       .9972366               2
   9       .4933486              13
  10       .9218807               1
  11       .1117133-002          62
  12       .7217808               6
  13       .8046436               4
  14       .4417269              15
  15       .2850254              23
  16       .8969467               2
  17       .2293015              18
  18       .1566442              17
  19       .4666636               7
  20       .3173511              21
  21       .6106839-002          92
  22       .6644110               5
  23       .1072643-001          72
  24       .9803829-001          17
  25       .7076964-002          45
  26       .4401400              10
  27       .2269791              27
  28       .3375559-001          41
  29       .4659114               4
  30       .1351269-001          39
  31       .6633140-001          10
  32       .4384878              10
  33       .1612562-001          25
  34       .6437461               8
  35       .2384186-006         173
  36       .4368393              15
  37       .5632228-001          26
  38       .5559623-004         116
  39       .7001826               3
  40       .8179311-001          45
  41       .2886278               9
  42       .4729810               9
  43       .2165139-004          64
  44       .2652407-005          66
  45       .4619360-00?         149
  46       .6786491               2
  47       .2166149              11
  48       .2682209-006         142
  49       .3830850-002          25
  50       .5722709              10
  51       .1132821              13
  52       .8452478-001          22
  53       .2264977-005         153
  54       .3129244-006         127
  55       .1102686-005         165
  56       .7777778               1
  57       .1671463-003          31
  58       .4172325-006         104
  59       .1415610-005          68
  60       .1192093-006          92
  61       .1293644-002          57
  62       .1404643-002          28
  63       (illegible)            5
  64       .4649013-003          43
  65       .1311105-006          42
  66       .1985213               5
  67       .8940697-007         229
  68       .2439254               4
  69       .47?8424-002          14
  70       .3411472-003          17
  71       .1043081-006         600
  72       .1043091-006         190
  73       .1043081-006         409


TABLE 7.IX

SURVIVAL FUNCTION AND ISOLATION INDEX VALUES FOR
CLUSTERS FROM BUOY 3 OF SCENARIO 1

(? = digit illegible in the source)

CLUSTER      SURVIVAL        ISOLATION
  NO.        FUNCTION          INDEX

   1       .48691?0              14
   2       .6630117               8
   3       .2757257              25
   4       .4042685-004         191
   5       .4864466              14
   6       .2124872              30
   7       .7896106-001          49
   8       .8570225               3
   9       .5674055              11
  10       .1141836              28
  11       .5789733               8
  12       .6948277-002          95
  13       .8568013               3
  14       .1609325-005         262
  15       .1137623              28
  16       .5202626-001          38
  17       .1236796-005         281
  18       .813?705               4
  19       .1868028              13
  20       .2?89062              16
  21       .925?659               1
  22       .3250551-001          33
  23       .6?1422?7-002         96
  24       .6607276               8
  25       .3638257              13
  26       .85608?9               3
  27       .6270218-001          53
  28       .?153132-002         104
  29       .83?6999               1
  30       .1916807               8
  31       .695?670-003          92
  32       .5?5?256               4
  33       .1606063              35
  34       .2?57201-003          79
  35       .627?4?5               3
  36       .2229120               4
  37       .2197?62              29
  38       .2980232-006         207
  39       .13328?5-001          14
  40       .4812570               7
  41       .5071954-001          19
  42       .1750886-004         203
  43       .1288421-001          13
  44       .6755485               3
  45       .2078058              10
  46       .2908707-004         193
  47       .7163716-002          62
  48       .9?47246-001           6
  49       .3642718-001          18
  50       .1591002              10
  51       .1291833               5
  52       .2543985               3
  53       .9631157-002          22
  54       .5627866-001           6
  55       .1341105-006         692
  56       .49966?7-002          20
  57       .5066395-006         136
  58       .1639128-005         167
  59       .1273496               6
  60       .4329841               2
  61       .1490116-006          36
  62       .1341105-006         110
  63       .1034689-001          55
  64       .3669095               3
  65       .2089143-004          23
  66       .1516947-001          37
  67       .1882628-002          12
  68       .1192093-006          75
  69       .10433?1-00?         299
  70       .1559631              13
  71       .3577756              12
  72       (illegible)            7
  73       .6867349-0?3          28
  74       .25331?7-0??         242
  75       .1043081-006         250
  76       .8940697-007          19
  77       .8940697-007         178
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7.3.3  Cluster  Extraction  Results  --  Buoy  3  and  Buoy  1 

A cut-off point of 10^-4 was used for the
survival function to determine which clusters were passed to
the extraction module for further processing. For buoy 1, the
following cluster numbers survived this test: 35, 38, 43, 44,
45, 48, 53, 54, 55, 58, 59, 60, 65, 67, 71, 72, and 73. For
buoy 3, these cluster numbers were chosen: 4, 14, 17, 38, 42,
46, 55, 57, 58, 61, 62, 65, 68, 69, 74, 75, 76, and 77.


The polarization test (see section 3.4)
eliminated from further consideration clusters 38, 53, and 55
from buoy 1 and clusters 4, 14, 17, 42, 46, 58, and 74 from
buoy 3. The regression test and the exclusion-inclusion rules
resulted in clusters 67, 71, and 72 being extracted for buoy 1
and clusters 55, 69, and 75 being extracted for buoy 3. Thus,
for buoy 1, the clustering and extraction processes were able
to completely separate the data into three clusters, each
containing only the measurements belonging to a given target.
For buoy 3, the same processing techniques produced two of the
required data clusters but not the third, selecting cluster 69
rather than 76. From Table 7.VII, it can be seen that cluster 76
contains clusters 69 and 74 as subsets. The polarization test
rejected cluster 74 as a potential candidate and the regression
test rejected the consistency hypothesis of 69 and 74. Thus,
only the last two-thirds of the data for target 2, which was
contained in cluster 69, was selected for further consideration.

7.3.4  Initial  Guess  Results 


Table 7.X gives a listing of the clusters
extracted for each buoy along with an identification of which
target is represented by each cluster.

TABLE 7.X (columns: Buoy Number, Cluster Number, Target Number;
entries illegible in the source)

Allowing a track to be composed of both two-cluster and
three-cluster intersensor combinations, there are then

(3*3 + 3*3 + 3*3) + (3*3*3) = 54

potential target tracks. Only twelve of these potential
inter-sensor combinations correspond to actual tracks. Table
7.XI lists the results from the initial guess algorithm. From
this table it can be seen that:

1)  Of the twelve cluster combinations
corresponding to actual tracks, only one was
rejected (marked by the asterisk),

2)  Of the 42 cluster combinations corresponding
to pseudo tracks, 29 (69%) were rejected,

3)  Five of the three-cluster combinations were
not even considered because at least two of
the associated two-cluster combinations had
been rejected, and

4)  From Figure 5.1, several of the pseudo
estimates can be seen to correspond quite
closely to the illustrated line-of-sight
crossed-bearing intersections.

Once the potential track set has been pruned by the initial
guess procedure, the resultant state vector estimates and their
associated cluster combinations are then passed to the integer
programming module.
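The counts quoted above (54 potential tracks, 12 actual, 42 pseudo) can be checked by direct enumeration. The sketch below is an idealization rather than the report's code: it assumes every buoy contributes exactly one extracted cluster per target and labels each cluster by its target number. The function names are hypothetical.

```python
from itertools import combinations, product

BUOYS = (1, 2, 3)
TARGETS = (1, 2, 3)  # idealized: each buoy contributes one cluster per target


def candidate_tracks(buoys=BUOYS, targets=TARGETS):
    """Enumerate every two- and three-buoy cluster combination.

    Each candidate is a tuple of (buoy, target_label) pairs, where the
    label stands for the cluster that buoy extracted for that target.
    """
    candidates = []
    for n_buoys in (2, 3):
        for buoy_subset in combinations(buoys, n_buoys):
            for labels in product(targets, repeat=n_buoys):
                candidates.append(tuple(zip(buoy_subset, labels)))
    return candidates


def is_actual_track(candidate):
    """True when every cluster in the combination belongs to one target."""
    return len({label for _, label in candidate}) == 1


candidates = candidate_tracks()
actual = [c for c in candidates if is_actual_track(c)]
pseudo = [c for c in candidates if not is_actual_track(c)]

print(len(candidates), len(actual), len(pseudo))  # 54 12 42
print(round(100 * 29 / len(pseudo)))              # 69
```

The three buoy pairs each give 3*3 = 9 combinations and the single buoy triple gives 3*3*3 = 27, which reproduces the total of 54. The twelve actual tracks are the three pairs plus the one triple for each of the three targets.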


Buoy 1             Buoy 2             Buoy 3             Accepted
(Cluster/Target)   (Cluster/Target)   (Cluster/Target)   or Rejected
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7.3.5  Integer  Programming  Results 

Figure 7.2 gives the integer programming problem
to be solved for this scenario (see Section 6). Note that the
number of potential tracks represented in the objective
function (i.e., the number of columns) is fewer than the number
passed by the initial guess module. This happens because those
tracks with positive costs, as computed by the HTA, need not be
considered, since they would never be part of the solution to
this minimization problem. Dropping them can significantly
reduce the time spent finding an optimum, since the solution
algorithm spends much of its time arranging and sorting the
columns.


The solution found by the algorithm is

X_{13} = X_{14} = X_{16} = 1

X_1 = X_2 = ... = X_{12} = X_{15} = S_1 = S_2 = ... = S_4 = 0.

Using Figure 7.2 and Table 7.X this solution is
seen to correctly correspond to:

X_{13} - three sensor data for target 3 (1, 1, 1),

X_{14} - three sensor data for target 1 (2, 2, 3),

X_{16} - three sensor data for target 2 (3, 3, 2).
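The remark that positive-cost tracks can be dropped before solving can be illustrated with a toy selection problem. The sketch below is not the formulation of Section 6 or Figure 7.2, which are not reproduced here. It assumes a set-packing form in which each cluster may be used by at most one selected track, and it uses a brute-force search rather than the report's solution algorithm. All costs and track names are hypothetical.

```python
from itertools import combinations

# Hypothetical candidate tracks: name -> (cost, set of (buoy, cluster) used).
# Negative cost means the track is favored in the minimization.
TRACKS = {
    "A": (-5.0, {(1, 1), (2, 1), (3, 1)}),
    "B": (-4.0, {(1, 2), (2, 2), (3, 3)}),
    "C": (-3.0, {(1, 3), (2, 3), (3, 2)}),
    "D": (-2.5, {(1, 1), (2, 2)}),
    "E": (-1.0, {(2, 3), (3, 3)}),
    "F": (+0.7, {(1, 3), (3, 1)}),
}


def is_feasible(names, tracks):
    """A scenario is feasible when no cluster is shared by two tracks."""
    used = set()
    for name in names:
        clusters = tracks[name][1]
        if used & clusters:
            return False
        used |= clusters
    return True


def best_scenario(tracks):
    """Exhaustively search for the feasible subset with minimum total cost."""
    best_cost, best_set = 0.0, ()
    names = sorted(tracks)
    for size in range(1, len(names) + 1):
        for subset in combinations(names, size):
            if is_feasible(subset, tracks):
                cost = sum(tracks[name][0] for name in subset)
                if cost < best_cost:
                    best_cost, best_set = cost, subset
    return best_cost, best_set


full = best_scenario(TRACKS)
# Drop every column with a positive cost before searching.
pruned = best_scenario({n: t for n, t in TRACKS.items() if t[0] < 0})

print(full, pruned)
```

Under these assumptions, removing a positive-cost track from any feasible scenario leaves it feasible and lowers its cost, so the pruned problem has the same optimum as the full one while the search covers fewer columns.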

7.3.6  Conclusion 

Figure 7.2.  Integer Programming Equation for Scenario

The MTTA correctly sorted the raw sensor data,
correctly eliminated about 70% of the possible pseudo-tracks,
and then identified the actual tracks for this scenario.
Figure 7.1 illustrates the estimated tracks generated by
the MTTA and the actual trajectories followed by the targets.
The "tic" marks in each true trajectory line represent
two-minute intervals. All the estimated trajectories have come
fairly close to the actual tracks before the first tic mark.
The early severe fluctuations are a direct result of the batch
initializer's updating scheme. After an initial state vector
estimate is obtained for t = 0, a Kalman filter is used to update
the state to the current time. The filter is applied with a
large initial covariance matrix, which allows it to search over
a wide area to update the state estimate. Typically, early in
the track, only one buoy supplies the first three or four
measurements, and this limited observation geometry coupled
with the large covariance matrix results in state updates which
are substantially different from the actual ones. In all
cases, the initial state vector estimates at t = 0 and the
updated estimates passed to the sequential filter after
initialization were quite close to the true values. In any
event, the tracking accuracy is a result of HTA
characteristics, not MTTA performance, given that MTTA has
correctly solved the intersensor data matching problem.

The large fluctuations observed for target 2 in
Figure 7.1 are due to the fact that, for buoy 3, the cluster
extraction algorithm did not put the first third of the T2
(target 2) data set with the last two-thirds. Thus, buoy 2 by
itself supplied the first few measurements for target 2. This
failure of the extraction algorithm can be traced to the
linearity assumption of the testing mechanism; target 2 went
through CPA relative to buoy 3, resulting in severe nonlinear
frequency and bearing changes. The first part of the T2 data
set contained the Doppler compression data up through CPA; the
last part consisted of the Doppler expansion data for T2 that
was found after CPA. The cluster extraction algorithm rightly
viewed these two data streams as different processes, and
concluded that they did not go together. It is felt that this
problem can be remedied by including quadratic terms in the
extraction's regression model to compensate for the nonlinear
frequency and bearing changes encountered during CPA.
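The nonlinearity through CPA can be seen in a simple constant-velocity Doppler model. The sketch below is illustrative only and is not the report's simulator: the 150 Hz source frequency is suggested by the data tables, while the 5 m/s speed, 1000 m CPA range, and 1500 m/s sound speed are assumed values, and the function names are hypothetical.

```python
import math


def doppler_frequency(t, f0=150.0, speed=5.0, cpa_range=1000.0, c=1500.0):
    """Received frequency (Hz) at time t (s) relative to CPA.

    Straight-line, constant-speed target; stationary receiver.
    Uses the low-speed approximation f = f0 * (1 - range_rate / c).
    """
    along_track = speed * t                       # distance past CPA (m)
    slant_range = math.hypot(cpa_range, along_track)
    range_rate = speed * along_track / slant_range
    return f0 * (1.0 - range_rate / c)


def slope(t, dt=1.0):
    """Central-difference estimate of df/dt (Hz/s) at time t."""
    return (doppler_frequency(t + dt) - doppler_frequency(t - dt)) / (2.0 * dt)


slope_early = slope(-600.0)   # well before CPA
slope_cpa = slope(0.0)        # at CPA

# Fit a straight line to two early samples and extrapolate past CPA.
f_a, f_b = doppler_frequency(-600.0), doppler_frequency(-500.0)
line_slope = (f_b - f_a) / 100.0
predicted = f_a + line_slope * 1200.0             # extrapolated to t = +600 s
error = abs(predicted - doppler_frequency(600.0))

print(slope_early, slope_cpa, error)
```

With these assumed values the frequency slope at CPA is more than ten times the slope ten minutes earlier, and a straight line fitted before CPA misses the post-CPA frequency by several tenths of a hertz. That is large compared with the 0.1 Hz steps in the tabulated data, which is why a linear consistency test would treat the two segments as different processes.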


7.4  Three Target Scenario with Bad Geometry, Low
Noise, 0 dB Threshold in 1 Hz Band

Figure 7.3 shows the scenario discussed in this
section. Table 7.II lists the true initial and final
conditions for each of the targets in this scenario. The
trajectories simulated are the same as those modeled for the
scenario discussed in section 7.3, except that target 1 has a
forty-five degree course heading instead of the fifteen degree
heading that was used previously.

From Figure 7.3, it can be seen that the MTTA
provides reasonable tracks for targets 1 and 2, but no track at
all is generated for target 3. Tables 7.XII and 7.XIII list
the data and the clusters, respectively, selected for buoy 1,
while Tables 7.XIV and 7.XV list the same information for buoy
3. Buoy 2 sees all targets perfectly. From Tables 7.XII and
7.XIII, it can be seen that buoy 1 sees target 3 perfectly;
however, it confuses targets 1 and 2 and splices the last
ten points from target 1 onto the target 2 observation set. A
similar phenomenon occurs for buoy 3, where the first 10
observations from target 2 are interspersed among those of
target 1. Buoy 3 also sees target 3 perfectly. Looking at the
data, it is clear that the measurements become quite similar
during these times and the clustering algorithm simply cannot
distinguish between the right and wrong data sets. However,
when the power levels of the various signals are examined, a
clear difference is exposed. The choice of attribute variables
used in this study ignores signal power and consequently
treats all measurements as if they were received at a common
power level. However, the propagation losses for a near target
should be much smaller than those for a distant target, so
clearly the received signal powers should vary according to the
proximity of the target to the receiver. Other studies
currently under way at Tracor indicate the power levels can be
clustered by applying an intensity index to the power level and
then clustering with this index. It is felt that application
of a similar methodology to DIFAR data could improve the
ability of the algorithm to separate data from two targets that
are close in bearing and frequency, but differ in power levels.

7.5  Three Target Scenario with Good Geometry,
Moderate Noise, No Threshold


Geometrically, this scenario is the same as the
one discussed in section 7.3. The difference between the two
scenarios is in the simulated source power levels. For the
first scenario, there was an 80 dB source level, but for this
scenario, the source level was reduced to 77 dB. Figure 7.4
provides a graphical comparison of the true target tracks with
the MTTA's estimates of these tracks.

This particular scenario was simulated to test
the following two data processing capabilities of the MTTA:

1)  The ability of the clustering and automatic
cluster extraction algorithms to separate
signals from random background noise, and

2)  The ability of the MTTA's HTA and IP to pick
the correct targets and to adequately track
them with noisy data gathered from moderate
strength target signals.

As stated previously, the 3 dB drop in the simulated target
signal strength resulted in an approximate 30% decrease in the
range over which the target could be readily detected. For
this scenario, each of the targets could be well detected by
one (i.e., the closest) sonobuoy, each could be fairly well
detected by a second sonobuoy, but the targets could be only
marginally detected by the sonobuoy farthest from the signal
source. In particular, if a 0 dB threshold were used for
detection by the sonobuoys, each target would be nearly
unobservable for one sonobuoy in the pattern. To compensate
for any possible loss of data, it was decided to drop the
threshold criterion and to let the clustering algorithm sort
the true signals from the random noise.
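The figure of roughly 30% for a 3 dB drop is what a simple spreading-loss model would give. The sketch below is a back-of-envelope check, not the report's propagation model, which is not stated in this section. It assumes spherical spreading, with transmission loss 20 log10(r) and no absorption. The function name and the spreading coefficient parameter are hypothetical.

```python
def range_ratio(level_drop_db, spreading_coeff=20.0):
    """Ratio of new to old detection range after a source-level drop.

    Assumes transmission loss = spreading_coeff * log10(range), so the same
    received level is reached at range * 10 ** (-drop / spreading_coeff).
    spreading_coeff = 20 corresponds to spherical spreading (an assumption).
    """
    return 10.0 ** (-level_drop_db / spreading_coeff)


ratio = range_ratio(3.0)                 # 80 dB source reduced to 77 dB
decrease_percent = 100.0 * (1.0 - ratio)

print(round(ratio, 3), round(decrease_percent))  # 0.708 29
```

Under this assumption the detection range shrinks to about 71% of its former value, a decrease of about 29%. A cylindrical-spreading assumption (coefficient 10) would give about 50% instead.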


Overall,  the  MTTA  performed  very  well  in 
processing  the  noisy  data.  For  the  most  part,  random  noise  was 
eliminated  from  the  true  signals  by  the  clustering  and 
automatic  extraction  processes.  Most  of  the  signal  clusters 
contained  a  couple  of  noise  points  that  were  similar  to  the 
true  measurement  set,  but  that  did  not  really  belong  with  the 
true  data.  However,  the  DIFAR  simulator  computes  a  measurement 
standard  deviation  that  is  a  function  of  the  estimated  SNR  of 
the  received  signal.  For  the  spurious  noise  samples,  the 
estimated  SNR  values  were  very  small  (i.e. ,  less  than  0  dB) ,  so 
a  large  variance  was  computed  for  the  frequency  and  bearing 
estimates.  The  HTA  uses  the  measurement  variance  to  weight  the 
measurement  before  generating  a  least  squares  update  of  the 
target  track.  When  large  variances  are  provided  with 
measurement data, the tracker essentially weights the data so
that they can cause little or no change in the track update.
The  net  effect  is  that  when  only  a  few  poor  measurements  are 
provided  to  the  HTA  with  variances  that  indicate  the  data 
should  carry  minimal  weight,  the  HTA  can  continue  to  track  the 
target  without  seriously  degrading  the  track  updates.  The 
results  from  this  scenario,  presented  in  Figure  7.4,  illustrate 
this  point  well. 
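
The down-weighting effect described above can be sketched with a scalar, variance-weighted sequential update. This is a minimal illustration in Python and not the HTA itself: the function name, the scalar state, and the numbers are all assumptions chosen for the example.

```python
def weighted_update(estimate, est_var, measurement, meas_var):
    """One scalar least-squares update, weighting the measurement by its variance.

    Hypothetical helper for illustration; the HTA's actual state vector
    and update equations are not reproduced here.
    """
    # Gain approaches 1 for a trusted measurement and 0 for a noisy one.
    gain = est_var / (est_var + meas_var)
    new_estimate = estimate + gain * (measurement - estimate)
    new_var = (1.0 - gain) * est_var
    return new_estimate, new_var


# Same outlying measurement, reported with a small and then a large variance.
trusted, _ = weighted_update(100.0, 4.0, 130.0, meas_var=1.0)
spurious, _ = weighted_update(100.0, 4.0, 130.0, meas_var=400.0)
print(trusted, spurious)
```

With the small variance the estimate moves most of the way toward the measurement; with the large variance it barely moves, which is the behavior attributed to the spurious noise samples.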

From  the  figure,  it  can  be  seen  that  after  some 
fluctuations  during  the  track  initialization  phase,  excellent 
track  estimates  were  obtained  for  targets  2  and  3.  For  target 
1,  somewhat  poorer  estimates  of  the  trajectory  were  obtained. 
Part  of  the  problem  with  this  estimate  is  that  target  1  is 
approaching  buoy  3  nearly  head-on  from  a  very  long  distance. 
Because  the  target  is  heading  nearly  directly  at  the  buoy,  very 
little  bearing  change  would  be  observed  even  for  a  well 
received  signal.  Since  the  signal  is  only  of  moderate  strength 
and  is  fairly  distant  from  the  sonobuoy  receiver,  the 
propagation  losses  are  substantial  enough  to  cause  the  signal 
to  be  poorly  received.  Since  the  signal  is  so  poorly  detected, 
serious  fluctuations  result  in  the  bearing  measurements  which 
may  in  turn  appear  to  the  tracker  to  be  a  significant  bearing 
rate.  Hence,  even  though  buoy  3  does  observe  target  1  and  the 
clustering  and  cluster  extraction  processes  adequately  sort  the 
data,  the  measurements  from  buoy  3  are  still  of  such  poor 
quality  that  they  adversely  affect  the  tracking  solutions  from 
the  MTTA  for  target  1.  Nonetheless,  these  tracking  solutions 
for  all  the  targets  would  probably  be  adequate  for  real  world 
encounters.
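
The geometry argument made above for target 1 can be illustrated numerically. The sketch below compares the bearing change seen by a single buoy for a nearly head-on track and for a crossing track. The positions, speeds, and duration are hypothetical values chosen for illustration and are not taken from the scenario.

```python
import math


def bearing_deg(x, y):
    """Bearing from a buoy at the origin to (x, y), degrees clockwise from north (+y)."""
    return math.degrees(math.atan2(x, y))


def bearing_change_deg(start, velocity, duration_s):
    """Change in bearing for a constant-velocity target over duration_s seconds."""
    x0, y0 = start
    vx, vy = velocity
    x1, y1 = x0 + vx * duration_s, y0 + vy * duration_s
    return bearing_deg(x1, y1) - bearing_deg(x0, y0)


# Distant target closing almost directly on the buoy (assumed numbers, metres and m/s).
head_on = bearing_change_deg(start=(500.0, 20000.0), velocity=(0.0, -5.0), duration_s=600.0)
# Closer target crossing in front of the buoy (assumed numbers).
crossing = bearing_change_deg(start=(-1500.0, 5000.0), velocity=(5.0, 0.0), duration_s=600.0)
print(head_on, crossing)
```

In the head-on case the true bearing change is a fraction of a degree over ten minutes, so measurement fluctuations of even a few degrees can dominate and be mistaken for a bearing rate.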


In  conclusion,  the  MTTA  seems  to  have  passed  the 
two tests that this scenario was designed to examine. First,
it did an adequate job of separating noise from true signals.
Second,  the  MTTA  did  an  adequate  job  of  tracking  the  targets 
with  the  poorer  quality  data  generated  for  this  examination. 
However,  once  again,  bad  observational  geometry  adversely 
affected  the  MTTA's  estimates,  perhaps  even  more  so  than  the 
degraded  quality  of  the  data. 

7.6  Two Target Scenario with Good Geometry, Low Noise, and 0 dB in a 1 Hz Band Threshold


The  last  case  considered  by  this  study  was  a 
relatively  simple,  two  target  scenario.  The  same  tri-tac  buoy 
pattern  was  used  to  observe  the  targets,  and  favorable 
geometries  and  low  background  noise  levels  were  used  to 
simulate  this  scenario.  Figure  7.5  contains  a  plot  of  the  true 
target  trajectories  versus  the  MTTA's  estimates  of  the  tracks 
for  this  two-target  scenario.  While  the  tracking  solutions 
presented in the plots show good results, one interesting point
was encountered in generating these results that needs further
discussion. For target 2, the MTTA picked only a 2-cluster,
intersensor combination to track the target rather than the
correct  3-cluster  combination  that  would  have  been  preferred. 
The  remaining  third  cluster  was  left  unused,  and  the  MTTA  did  a 
good  job  of  tracking  the  target  without  the  data  from  this 
cluster.  One  must  question  why  the  third  cluster  describing 
target  2  was  excluded,  however,  and  so  the  results  were  more 
closely  examined  to  find  the  cause  of  this  exclusion. 

For this scenario, each sonobuoy should have
generated two different data clusters, one cluster to represent
each of the two targets. The clustering and extraction
algorithms  did  a  perfect  job  of  sorting  the  data  into 
individual  target  data  sets  for  each  of  the  sonobuoys.  Next, 
the  data  clusters  were  passed  to  the  initial  guess  (IG) 
algorithm  to  eliminate  unlikely  cluster  combinations  and  to 
keep  the  rest.  Since  the  data  were  simulated  for  this 
scenario,  it  was  known  that  the  correct  cluster  combination  for 
target  1  should  have  been  (1,  2,  1),  meaning  the  first  cluster 
from  sensor  1,  the  second  cluster  from  sensor  2,  and  the  first 
cluster  from  sensor  3  should  be  combined  to  track  target  1. 
For  target  2,  the  correct  three-cluster  combination  should  have 
been (2, 1, 2). The IG correctly passed all of the correct
2-cluster combinations and the single 3-cluster combination for
target 1. It also passed all of the correct 2-cluster
combinations for target 2, but the IG rejected the one
3-cluster combination that should have been passed to represent
target 2. Apparently, the 3-cluster statistical test found too
much statistical difference between this estimate and the
2-sensor combinations that make up this cluster triplet.
Because of this rejection by the IG, the preferred 3-cluster
combination was eliminated before the MTTA's IP could even
consider it. Without this cluster triplet, the IP could only
pick the single 2-sensor combination that minimized the cost
function and leave the remaining cluster unaccounted for. This
proved to be exactly what happened when the MTTA picked optimal
solutions of (1, 2, 1) for target 1 and (2, 1, 0) for target 2
and left the second cluster for sensor 3 unused.
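
The combination notation used above can be made concrete with a short sketch. It enumerates the intersensor combinations for three sensors with two clusters each and reports which clusters a given selection leaves unused. The function names are hypothetical, and a 0 entry is read here as "no cluster taken from that sensor", an interpretation inferred from the (2, 1, 0) solution quoted above. Neither the IG tests nor the IP cost function is reproduced.

```python
from itertools import product


def candidate_combinations(clusters_per_sensor, min_sensors=2):
    """All tuples (c1, c2, ...) with ci in 0..n_i that use at least min_sensors sensors."""
    ranges = [range(0, n + 1) for n in clusters_per_sensor]
    return [
        combo
        for combo in product(*ranges)
        if sum(1 for k in combo if k != 0) >= min_sensors
    ]


def unused_clusters(selection, clusters_per_sensor):
    """(sensor, cluster) pairs not covered by any combination in the selection."""
    used = {
        (sensor, k)
        for combo in selection
        for sensor, k in enumerate(combo, start=1)
        if k != 0
    }
    every = {
        (sensor, k)
        for sensor, n in enumerate(clusters_per_sensor, start=1)
        for k in range(1, n + 1)
    }
    return sorted(every - used)


combos = candidate_combinations([2, 2, 2])
print(len(combos))
print(unused_clusters([(1, 2, 1), (2, 1, 0)], [2, 2, 2]))
```

For the solution reported above, the only uncovered pair is the second cluster of sensor 3, which matches the cluster the MTTA left unused.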


In conclusion, the MTTA once again performed well
in tracking the two targets in this scenario as can be seen in
Figure 7.5. Unfortunately, the MTTA's IG module prematurely
eliminated the cluster combination that would have been
preferred to track target 2. This one case is the sole
example in all the multi-target scenarios studied where the IG
failed to pass the preferred 3-sensor solution. It should be
remembered, however, that any statistically based test is going
to make a wrong decision part of the time, even though
hypothesis tests are constructed to minimize type I errors.
The result is that, occasionally, good initial guess estimates
are going to be eliminated purely by chance.

8.0  SUMMARY AND RECOMMENDATIONS

8.1  Project  Summary 

The  scenarios  discussed  in  Section  7.0  have  shown 
that  in  its  present  form  the  MTTA  possesses  the  ability,  with 
no  required  operator  interventions,  to  properly  sort  passive 
sonobuoy data into individual target data sets, correlate those
data  sets  across  sensors,  and  select  the  correct  target  track 
scenario  from  the  hundreds  of  potential  tracks.  Additionally, 
in  its  current  configuration,  the  algorithm  has  performed  in 
near  real  time  and,  with  minimal  processing  alterations,  could 
operate  substantially  faster  than  real  time. 

Simulation  results  showed  the  importance  that 
scenario  geometry  played  in  correctly  identifying  appropriate 
tracks.  Under  certain  circumstances,  the  clustering  algorithm 
can  become  confused  over  target  data  and  group  together  points 
from  different  targets  having  substantially  different  power 
levels.  However,  even  in  the  worst  case  studied,  two  of  the 
three  possible  tracks  were  estimated  fairly  well.  Results 
based  on  good  geometry  but  moderate  to  high  background  noise 
levels indicate that the MTTA is not as sensitive to noise
levels  as  it  is  to  bad  geometry.  This  implies  that  the 
clustering  and  extraction  algorithms  are  confused  less  by 
measurements  corrupted  by  noise  than  by  similar  measurements 
coming  from  different  targets.  Remedies  to  these  problems  will 
be  discussed  in  the  next  section. 

8.2  Recommendations  for  Future  Work 


The recommendations for future work contained in
this report fall into two natural categories: enhancements to
correct the deficiencies observed in the MTTA under the current
contract, and extensions required to make the MTTA able to
perform for long periods of time.

Under the current contract, the following
deficiencies were detected in the MTTA:

1)  Confusion of targets under certain
geometries, even when the power levels of
the signals were different, and

2)  Incorrect cluster extraction when targets
were undergoing CPA.

To  remedy  the  first  problem,  it  is  felt  that  the  estimated 
power  level  can  be  converted  to  an  intensity  index,  and  this 
index  can  then  be  used  as  an  added  attribute  to  sort  the  data 
by  clustering  techniques.  The  reason  for  converting  the  power 
level  to  index  form  is  that  typically  the  raw  power-level 
estimates  vary  too  much  to  be  useful,  but  indexing  such  as  is 
done  for  visual  displays  has  proven  to  be  useful  in  other 
studies.  The  second  problem  can  be  corrected  by  adopting  a 
curvilinear  model  instead  of  a  linear  one  for  the  extraction 
regression  equations.  This  should  permit  the  model  to  more 
closely  approximate  the  nonlinear  measurement  time  histories 
observed  during  CPA. 
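
The intensity-index idea proposed above for the first deficiency can be sketched as a simple quantization of the estimated power level. The bin floor, bin width, and number of levels below are hypothetical values chosen for illustration; the report does not specify an indexing scheme.

```python
def intensity_index(power_db, floor_db=0.0, step_db=3.0, levels=8):
    """Map a raw power-level estimate (dB) to a coarse integer index.

    Hypothetical scheme: uniform bins of step_db starting at floor_db,
    clamped to the range 0..levels-1.
    """
    idx = int((power_db - floor_db) // step_db)
    return max(0, min(levels - 1, idx))


# Fluctuating raw estimates from one source collapse to a single index,
# while a clearly stronger source lands in a different bin.
weak_source = [intensity_index(p) for p in (9.2, 10.1, 11.5)]
strong_source = [intensity_index(p) for p in (18.3, 19.9, 20.6)]
print(weak_source, strong_source)
```

The index, rather than the raw level, would then be appended to the frequency and bearing attributes used by the clustering step.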

Proposed  extensions  to  the  MTTA  are  essentially 
based  on  the  nature  of  the  multi-target  tracking  problem 
itself.  Basically,  the  problem  can  be  divided  into  two  parts. 
The  first  part  consists  of  recognizing  new  targets  and 
initializing  their  tracks,  and  the  second  part  consists  of 
updating tracks for established targets. Tracor's MTTA, as it
presently stands, is essentially a solution to the first part
of  the  problem.  It  can  take  raw  data,  automatically  cluster  to 
recognize  targets,  and  then  estimate  the  associated  tracks  for 
each  of  the  targets. 

Figure  8.1  represents  Tracor's  conception  of  the 
completed  multitarget  algorithm.  The  current  MTTA  is  located 
in  the  "search  for  new  targets..."  box.  The  boxes  surrounding 
it contain data management functions. They maintain sensor
data pools, place elements in the current track table, and
remove points forming a trajectory from the sensor data pools.
There is no theoretical development associated with the
implementation of these functions; they amount to the lines of
code necessary to set up the appropriate data structures and
module linkages.

The  thrust  of  the  theoretical  development  will  be 
concentrated  on  determining  if  a  given  data  point  is  associated 
with  one  or  more  current  tracks  and  then  performing  the 
appropriate  update.  There  are  several  potential  candidate 
strategies for attacking this problem, including:

1)  Nearest neighbor and gating approaches of
Sea (Reference 18) and Singer and Sea (Reference 19),

2)  A posteriori analysis of track density by
Sittler (Reference 20) and Stein and
Blackman (Reference 21),

3)  Probabilistic data association scheme used by
Bar-Shalom and Tse (Reference 22), and

4)  Cluster-based data association schemes.

At this point, the thinking is that track determination and
state update will be handled by a combination of (1) and (4) in
conjunction with the existing sequential filter used by the
HTA. Recent work (Reference 23) indicates that (1) is among
the most widely used of all measurement classification
schemes. Coupled with a gating mechanism at the clustering
stage, it seems like a very promising approach. Additionally,
to handle the problem of intersecting targets, it may very well
be appropriate to allow a given measurement to update more than
one track but with a substantially reduced measurement weight
for each track. This is similar in philosophy to the use of the
overlapping cluster algorithm to sort data.
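
The gating and shared-measurement idea described above can be sketched for a scalar measurement as follows. This is an illustrative sketch only: the function name, the 3-sigma gate value, and the rule of inflating the measurement variance by the number of sharing tracks are assumptions made here, not the method of the cited references or of the MTTA.

```python
def gated_associations(measurement, meas_var, tracks, gate=9.0):
    """Return (track_id, variance_to_use) for every track whose gate holds the measurement.

    tracks maps track_id -> (predicted_value, prediction_variance).
    Results are ordered nearest first, so the first entry is the
    nearest-neighbor choice. When several tracks share the measurement,
    its variance is inflated so each track gives it less weight.
    """
    hits = []
    for track_id, (predicted, pred_var) in tracks.items():
        # Squared residual normalized by the combined variance.
        d2 = (measurement - predicted) ** 2 / (pred_var + meas_var)
        if d2 <= gate:
            hits.append((d2, track_id))
    hits.sort()
    share = len(hits)
    return [(track_id, meas_var * share) for _, track_id in hits]


tracks = {"A": (100.0, 1.0), "B": (103.0, 1.0), "C": (150.0, 1.0)}
print(gated_associations(101.0, 1.0, tracks))
print(gated_associations(200.0, 1.0, tracks))
```

A measurement that falls outside every gate returns an empty list and would be left in the sensor data pool as a candidate for a new target.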

In  summary,  it  is  felt  that  the  most  difficult 
part  of  the  multi-target  tracking  problem,  target  recognition 
and  track  initialization,  has  largely  been  solved.  The 
literature  contains  a  broad  range  of  strategies  for  updating 
tracks  once  initialization  has  occurred,  but  up  until  now, 
there  has  been  little  work  done  with  the  problem  of 
multi-target  track  initializations.  It  is  believed  that 
Tracor's MTTA provides the basis for solving this total
problem, and it is recommended that its development be
continued.